Explanations
references to communities and their leaders
Negative Logits
sacked            -0.45
Veracruz          -0.45
Cleared           -0.43
istung            -0.41
Hell              -0.40
🏾                -0.40
Fayetteville      -0.40
Fußballspieler    -0.40
lenie             -0.39
Fired             -0.39
Positive Logits
Jewish            0.60
jewish            0.60
jewish            0.59
Jewish            0.54
Rabbi             0.54
synag             0.52
Rabbi             0.52
himo              0.51
Talmud            0.51
juda              0.50
Activations Density: 0.183%