Here is a representative list of languages where models like Gemini generally demonstrate high proficiency. This is not exhaustive, but covers many of the most widely spoken and digitally represented languages:
Languages with Generally High Proficiency:
- Afrikaans
- Albanian
- Amharic
- Arabic
- Armenian
- Azerbaijani
- Basque
- Belarusian
- Bengali
- Bosnian
- Bulgarian
- Burmese
- Catalan
- Cebuano
- Chinese (Simplified & Traditional)
- Corsican
- Croatian
- Czech
- Danish
- Dutch
- English (typically the strongest)
- Esperanto
- Estonian
- Filipino (Tagalog)
- Finnish
- French
- Frisian (West)
- Galician
- Georgian
- German
- Greek
- Gujarati
- Haitian Creole
- Hausa
- Hebrew
- Hindi
- Hmong
- Hungarian
- Icelandic
- Igbo
- Indonesian
- Irish
- Italian
- Japanese
- Javanese
- Kannada
- Kazakh
- Khmer
- Kinyarwanda
- Korean
- Kurdish (Kurmanji)
- Kyrgyz
- Lao
- Latin
- Latvian
- Lithuanian
- Luxembourgish
- Macedonian
- Malagasy
- Malay
- Malayalam
- Maltese
- Maori
- Marathi
- Mongolian
- Nepali
- Norwegian (Bokmål)
- Nyanja (Chichewa)
- Odia (Oriya)
- Pashto
- Persian (Farsi)
- Polish
- Portuguese (Portugal, Brazil)
- Punjabi (Gurmukhi, Shahmukhi)
- Romanian
- Russian
- Samoan
- Scottish Gaelic
- Serbian (Cyrillic & Latin)
- Sesotho
- Shona
- Sindhi
- Sinhala
- Slovak
- Slovenian
- Somali
- Spanish
- Sundanese
- Swahili
- Swedish
- Tajik
- Tamil
- Tatar
- Telugu
- Thai
- Turkish
- Turkmen
- Ukrainian
- Urdu
- Uyghur
- Uzbek
- Vietnamese
- Welsh
- Xhosa
- Yiddish
- Yoruba
- Zulu
Important Considerations:
- Beyond this List: The model likely has some level of understanding (e.g., for translation) of many more languages and dialects not listed here, especially those closely related to the ones above or those with a reasonable amount of text available online.
- Quality Varies: Performance in tasks like nuanced text generation, understanding complex idioms, or capturing cultural context will be significantly better for the most digitally represented languages (like English, Spanish, French, German, Chinese) than for less digitally represented ones. The list above is alphabetical, not ranked by proficiency.
- Task-Dependent: Proficiency might be higher for translation between languages than for generating original, creative text in a less common language.
So, yes, your intuition that it's more than 100 languages is correct when you count the full range of capabilities, from basic understanding and translation up to high-fidelity generation. The list above covers the languages where you can generally expect strong performance.