INDEX
Explanations
mentions or references to specific names, likely related to news or media coverage
names of individuals and locations
Negative Logits
ISA           -0.81
SPONSORED     -0.78
SPA           -0.73
acebook       -0.71
srfAttach     -0.66
Applications  -0.64
SAY           -0.63
MAP           -0.61
Ü             -0.60
leness        -0.60
Positive Logits
vous    1.05
ÅĤ      0.76
hur     0.73
án      0.72
ovsky   0.70
Ö¼      0.68
henko   0.67
ooth    0.67
owicz   0.67
opol    0.66
Activations Density 0.302%