The Extensible Markup Language (XML) is a complex language, and XML-based protocols, which are widely used in cloud computing, are susceptible to entire classes of security problems. Message formats in XML-based protocols are usually specified in XML Schema, and as a first line of defense, schema validation should reject syntactically unacceptable input. However, extension points in many protocol specifications break validation. Extension points are considered best practice for loose composition, but they also enable an attacker to add unchecked content to a document, e.g., for a signature wrapping attack. This thesis presents a security monitor for language-based anomaly detection in XML-based protocols. The contributions are datatyped XML visibly pushdown automata (dXVPAs) as a language representation for mixed-content XML, and an incremental learner that infers a dXVPA from example documents for the monitor. The learner generalizes XML types and datatypes in terms of automaton states and transitions, and an inferred dXVPA converges to a good-enough approximation of the acceptable language. The automaton is free of extension points and capable of stream validation. To deal with adversarial training data, i.e., poisoning attacks, operations for unlearning and sanitization are specified. Unlearning removes an identified poisoning attack from a dXVPA, and sanitization trims low-frequency states and transitions to eliminate hidden attacks. All algorithms were evaluated in four scenarios, including a web service implemented in Apache Axis2 and Apache Rampart, where attacks were simulated. In all scenarios, the learned automaton had zero false positives and outperformed traditional schema validation.
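
The idea behind sanitization can be illustrated with a minimal sketch. It is not the dXVPA algorithm of this thesis: it assumes a plain finite automaton whose transitions carry usage counters, and the names `sanitize`, `learned`, and `min_count`, as well as the states, symbols, and counts, are hypothetical. Transitions observed fewer than `min_count` times are dropped, and states that are no longer reachable from the start state are pruned.

```python
from collections import deque


def sanitize(transitions, start, min_count):
    """Trim rarely used transitions, then prune unreachable states.

    transitions: dict mapping (state, symbol) -> (next_state, count),
    where count is how often the transition was used during learning.
    This representation is an assumption made for illustration only.
    """
    # Step 1: keep only transitions observed at least min_count times.
    kept = {
        key: (target, count)
        for key, (target, count) in transitions.items()
        if count >= min_count
    }

    # Step 2: breadth-first search from the start state over what is left.
    reachable = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for (source, _symbol), (target, _count) in kept.items():
            if source == state and target not in reachable:
                reachable.add(target)
                queue.append(target)

    # Step 3: discard transitions that leave an unreachable state.
    return {key: value for key, value in kept.items() if key[0] in reachable}


# Hypothetical counters: the <script> path was seen once and may be poisoned.
learned = {
    ("q0", "<order>"): ("q1", 500),
    ("q1", "<item>"): ("q2", 498),
    ("q2", "</item>"): ("q1", 498),
    ("q1", "<script>"): ("q3", 1),
    ("q3", "</script>"): ("q1", 1),
}

cleaned = sanitize(learned, "q0", min_count=5)
```

In this toy setting, `cleaned` retains only the three frequently used transitions. The threshold is a trade-off: a value that is too high would also remove rare but legitimate behavior and could introduce false positives.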