Python Tokenizer

The Python tokenizer provides character-by-character streaming tokenization for Python source code.

Supported Token Types

Usage

using System;
using NTokenizers.Python;

// Tokenize a complete source string in one call.
var tokenizer = PythonTokenizer.Create();
var tokens = tokenizer.Parse("def greet(name: str) -> str: return f\"Hello, {name}!\"");

// Print each token's type followed by its text.
foreach (var token in tokens)
{
    Console.WriteLine($"{token.TokenType}: {token.Value}");
}
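
Since the result of `Parse` can be enumerated, it can also be post-processed with ordinary LINQ. The following is a minimal sketch, not part of the library's documented API surface: it assumes the returned collection implements `IEnumerable<T>`, and it uses only the `TokenType` and `Value` members shown above. The input string and the `countsByType` name are illustrative.

```csharp
using System;
using System.Linq;
using NTokenizers.Python;

// Illustrative input; any Python source text would do.
var tokens = PythonTokenizer.Create().Parse("x = 1\ny = x + 2\n");

// Assumption: the result supports LINQ (IEnumerable<T>).
// Count how many tokens of each type were produced.
var countsByType = tokens
    .GroupBy(token => token.TokenType)
    .Select(group => new { Type = group.Key, Count = group.Count() })
    .ToList();

foreach (var entry in countsByType)
{
    Console.WriteLine($"{entry.Type}: {entry.Count}");
}
```

A summary like this is a quick way to see which token types a given input actually produces.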

Streaming

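// `stream` is a readable source of Python text opened by the caller
// (assumed here to be a System.IO.Stream).
await PythonTokenizer.Create().ParseAsync(stream, onToken: token =>
{
    // Handle tokens as they arrive
});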
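
For a self-contained illustration, the sketch below wraps a string in a `MemoryStream` and records each token as it is reported. It assumes that `ParseAsync` accepts a `System.IO.Stream` of UTF-8 text and that the `onToken` callback is synchronous; neither is confirmed by the text above, so adjust it to the actual signature. The `source` and `seen` names are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using NTokenizers.Python;

// Illustrative input; in practice this could be a file or network stream.
var source = "def greet(name: str) -> str:\n    return f\"Hello, {name}!\"\n";

// Assumption: ParseAsync reads from a System.IO.Stream of UTF-8 text.
using var stream = new MemoryStream(Encoding.UTF8.GetBytes(source));

// Record a "type: value" line for each token, in arrival order.
var seen = new List<string>();

await PythonTokenizer.Create().ParseAsync(stream, onToken: token =>
{
    seen.Add($"{token.TokenType}: {token.Value}");
});

Console.WriteLine($"Received {seen.Count} tokens");
```

Collecting into a list is only for demonstration. The point of the callback form is that each token can be handled immediately, without waiting for the whole input.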

Markdown Integration

Python code blocks are automatically recognized in Markdown:

```python
def greet(name: str) -> str:
    return f"Hello, {name}!"
```

The MarkdownTokenizer passes the content of such code blocks to the Python tokenizer.

Limitations
