Accelerating LLM Inference with Smart NIC Tokenization and Caching