Explanations
references to specific URLs and online profiles
references to high levels of activity or mentions related to specific individuals or organizations
Negative Logits
Token      Logit
nces       -0.87
proto      -0.67
Izan       -0.66
nomine     -0.66
Minority   -0.64
Phant      -0.63
e          -0.62
pole       -0.62
Debor      -0.61
Palestin   -0.59
Positive Logits
Token      Logit
stein      1.06
mann       0.97
hl         0.97
uci        0.93
iere       0.91
igh        0.89
endor      0.89
iday       0.89
alm        0.87
uff        0.86
Activations Density: 0.008%