INDEX
Explanations
proper nouns related to individuals
references to specific individuals and mentions of spam
Negative Logits
GA          -0.73
heats       -0.70
erness      -0.70
erm         -0.69
pill        -0.67
provision   -0.66
HTTP        -0.65
basketball  -0.64
HTTP        -0.64
road        -0.63
Positive Logits
Schneider   1.81
Vincent     1.54
Viktor      1.34
Echo        1.34
Sven        1.21
Spect       1.20
spam        1.18
Pierre      1.08
Spect       1.02
Cousins     1.01
Activations Density: 0.029%