Explanations
words related to political ideologies
words related to various academic or scholarly topics
Negative Logits
VPN      -0.74
ËĪ       -0.73
haar     -0.72
mire     -0.70
âĵĺ      -0.69
IRD      -0.69
limit    -0.67
IDENT    -0.66
robe     -0.65
gat      -0.65
Positive Logits
ity      1.04
ities    0.94
ical     0.91
Pwr      0.90
ization  0.82
ité      0.80
onduct   0.79
ism      0.78
ized     0.78
abama    0.75
Activations Density 0.020%