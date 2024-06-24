Multimodal Large Language Models (MLLMs) are reshaping how we interact with technology, narrowing the gap between code and conversation. As technology becomes more naturally accessible through audio and voice, multimodal AI integration is already under way: several chatbots and AI assistants accept a range of basic voice commands. In the near future, we can expect these capabilities to expand considerably, with AI platforms accepting multimodal input as text, images, or voice and responding in a high level of detail.