INDEX
Explanations
mentions of different languages or language-related terms
references to different languages and their usage
Negative Logits
arious      -0.86
urion       -0.86
ocrates     -0.86
200000      -0.82
rotein      -0.79
arching     -0.78
ptions      -0.76
Tesla       -0.76
arcity      -0.74
uristic     -0.73
Positive Logits
equivalents  0.83
learners     0.81
proficiency  0.81
bilingual    0.79
immersion    0.76
supremacy    0.76
dictionary   0.75
spoken       0.75
interpre     0.73
hegemony     0.73
Activations Density: 0.043%