YAML Tokenizer

The YAML tokenizer breaks YAML text into tokens for further processing. It can read from a stream, which suits large YAML files and YAML data that arrives incrementally.

Note: The YAML tokenizer will be part of v1.1 and later.

Overview

The YAML tokenizer is part of the NTokenizers library. It tokenizes input as it is read rather than requiring the whole document in memory, which makes it suitable for large files and for streaming scenarios.
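As a minimal sketch of the streaming scenario described above, the following reads YAML from a file stream instead of an in-memory string. It relies only on the `YamlTokenizer.Create().ParseAsync(stream, onToken: ...)` call that appears in the usage examples on this page. The file name `sample.yaml`, the temporary-file setup, and the counting logic are illustrative assumptions, not part of the library.

```csharp
using NTokenizers.Yaml;
using System;
using System.IO;

// Hypothetical input file, written here only so the sketch is self-contained.
string path = Path.Combine(Path.GetTempPath(), "sample.yaml");
await File.WriteAllTextAsync(path, "name: Alice\nactive: true\n");

int tokenCount = 0;

// Open the file as a stream so the tokenizer reads from it directly.
using (var stream = File.OpenRead(path))
{
    // The callback runs once per token; this one only counts them.
    await YamlTokenizer.Create().ParseAsync(stream, onToken: token => tokenCount++);
}

Console.WriteLine($"Tokens read: {tokenCount}");
```

Counting inside the callback keeps memory use independent of the number of tokens, which is the property that matters for large inputs.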

Public API

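The YAML tokenizer inherits from BaseSubTokenizer<YamlToken>. The usage examples below rely on the static YamlTokenizer.Create() factory, on ParseAsync, which accepts either a Stream or a TextReader together with an onToken callback, and on Parse, which takes a string and returns the tokens.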

Usage Examples

Basic Usage with Stream

using NTokenizers.Yaml;
using Spectre.Console;
using System.IO;
using System.Text;

string yamlCode = """
    ---
    name: Alice Smith
    age: 30
    active: true
    hobbies:
      - reading
      - coding
    ...
    """;

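// Wrap the YAML text in a stream; any readable Stream can be passed the same way.
using var stream = new MemoryStream(Encoding.UTF8.GetBytes(yamlCode));
// onToken is invoked for each token the tokenizer produces.
await YamlTokenizer.Create().ParseAsync(stream, onToken: token =>
{
    // Escape the token text so characters such as '[' are not read as Spectre.Console markup.
    var value = Markup.Escape(token.Value);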
    var colored = token.TokenType switch
    {
        YamlTokenType.DocumentStart => new Markup($"[yellow]{value}[/]"),
        YamlTokenType.DocumentEnd => new Markup($"[yellow]{value}[/]"),
        YamlTokenType.Key => new Markup($"[cyan]{value}[/]"),
        YamlTokenType.Colon => new Markup($"[yellow]{value}[/]"),
        YamlTokenType.Value => new Markup($"[green]{value}[/]"),
        YamlTokenType.Comment => new Markup($"[grey]{value}[/]"),
        YamlTokenType.Quote => new Markup($"[green]{value}[/]"),
        YamlTokenType.String => new Markup($"[green]{value}[/]"),
        YamlTokenType.Anchor => new Markup($"[magenta]{value}[/]"),
        YamlTokenType.Alias => new Markup($"[magenta]{value}[/]"),
        YamlTokenType.Tag => new Markup($"[orange1]{value}[/]"),
        YamlTokenType.FlowSeqStart => new Markup($"[yellow]{value}[/]"),
        YamlTokenType.FlowSeqEnd => new Markup($"[yellow]{value}[/]"),
        YamlTokenType.FlowMapStart => new Markup($"[yellow]{value}[/]"),
        YamlTokenType.FlowMapEnd => new Markup($"[yellow]{value}[/]"),
        YamlTokenType.FlowEntry => new Markup($"[yellow]{value}[/]"),
        YamlTokenType.BlockSeqEntry => new Markup($"[yellow]{value}[/]"),
        YamlTokenType.Whitespace => new Markup(value),
        _ => new Markup(value)
    };
    AnsiConsole.Write(colored);
});

Using with TextReader

using NTokenizers.Yaml;
using System.IO;

string yamlCode = """
    name: John
    age: 30
    """;
using var reader = new StringReader(yamlCode);
await YamlTokenizer.Create().ParseAsync(reader, onToken: token =>
{
    Console.WriteLine($"Token: {token.TokenType} = '{token.Value}'");
});

Parsing String Directly

using NTokenizers.Yaml;

string yamlCode = """
    name: Alice
    active: true
    """;
var tokens = YamlTokenizer.Create().Parse(yamlCode);
foreach (var token in tokens)
{
    Console.WriteLine($"Token: {token.TokenType} = '{token.Value}'");
}
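Because Parse returns the tokens as a sequence, they can also be filtered after the fact. The sketch below collects the text of every token classified as a key. It uses only members that appear in the examples above (`Parse`, `TokenType`, `Value`, `YamlTokenType.Key`); the `keys` list is an illustrative helper, and the exact text carried by each Key token is an assumption that has not been checked against the library.

```csharp
using NTokenizers.Yaml;
using System;
using System.Collections.Generic;

string yamlCode = """
    name: Alice
    active: true
    """;

// Collect the text of every token the tokenizer classifies as a key.
var keys = new List<string>();
foreach (var token in YamlTokenizer.Create().Parse(yamlCode))
{
    if (token.TokenType == YamlTokenType.Key)
    {
        keys.Add(token.Value);
    }
}

Console.WriteLine($"Keys found: {string.Join(", ", keys)}");
```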

Token Types

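The YAML tokenizer produces YamlToken values, each carrying a TokenType of type YamlTokenType. The token types handled in the Basic Usage example are DocumentStart, DocumentEnd, Key, Colon, Value, Comment, Quote, String, Anchor, Alias, Tag, FlowSeqStart, FlowSeqEnd, FlowMapStart, FlowMapEnd, FlowEntry, BlockSeqEntry, and Whitespace.

For the full definition, see YamlTokenType.cs.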
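One way to see which token types a given input yields is to tally them. The following is a hedged sketch that uses only `YamlTokenizer.Create().Parse(...)` and `token.TokenType` as shown in the usage examples; the `counts` dictionary is an illustrative helper, and which types appear for this input depends on the tokenizer's behavior.

```csharp
using NTokenizers.Yaml;
using System;
using System.Collections.Generic;

string yamlCode = """
    name: Alice
    hobbies:
      - reading
      - coding
    """;

// Tally how many tokens of each type the tokenizer emits for this input.
var counts = new Dictionary<YamlTokenType, int>();
foreach (var token in YamlTokenizer.Create().Parse(yamlCode))
{
    counts[token.TokenType] = counts.GetValueOrDefault(token.TokenType) + 1;
}

// Print one line per token type that occurred at least once.
foreach (var pair in counts)
{
    Console.WriteLine($"{pair.Key}: {pair.Value}");
}
```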

Features

See Also