Explanations
mentions of government officials or political activities
mentions of the word "Min" or anything related to the concept of minimalism or measurement
Negative Logits
Token        Logit
enegger      -0.84
fitting      -0.84
atered       -0.81
ä½ľ          -0.73
schild       -0.70
caliphate    -0.64
razor        -0.64
hemoth       -0.63
llah         -0.62
fits         -0.62
Positive Logits
Token        Logit
neapolis     1.22
utes         1.11
ority        0.93
oby          0.92
erva         0.92
otaur        0.91
isters       0.90
ISTER        0.86
nesota       0.84
ister        0.83
Activations Density: 0.005%
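
As a rough illustration of how a figure like 0.005% can be read, the sketch below assumes that activation density means the share of token positions on which the feature fires (activation above zero). That definition, the function name `activation_density`, and the sample data are assumptions made for illustration; none of them come from the dashboard itself.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumed definition: the page does not state how density is computed.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 5 of which activate the feature.
sample = [0.0] * 99_995 + [1.3, 0.7, 2.1, 0.4, 0.9]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.005%
```

Under this reading, a density of 0.005% corresponds to roughly 5 active positions per 100,000 tokens, which marks the feature as very sparse.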