INDEX
Explanations
names or terms related to linguistics
text or phrases related to identifying information, particularly names and specific terminology in discussions about complex subjects
Negative Logits
Token      Logit
Coleman    -0.83
Eden       -0.74
Mous       -0.71
Archangel  -0.68
Cole       -0.68
Rahman     -0.66
Ingram     -0.66
Mitchell   -0.66
Liang      -0.65
Lanka      -0.65
Positive Logits
Token      Logit
sf         2.90
ingu       1.84
sv         1.53
sb         1.40
sd         1.34
sr         1.22
Mayer      1.16
erd        1.11
ansk       1.11
unch       1.09
Activations Density: 0.063%