Explanations
elements related to specific individuals or names, particularly those associated with organizations or movements
Negative Logits
thirds     -0.78
mble       -0.75
querque    -0.73
rums       -0.70
Rebels     -0.69
teenth     -0.69
Erica      -0.69
taboola    -0.67
leneck     -0.67
checks     -0.67
Positive Logits
oint       1.11
OY         1.04
avascript  0.93
osit       0.92
EG         0.91
ealous     0.89
OE         0.88
JP         0.88
IT         0.87
NI         0.87
Activations Density: 0.005%