INDEX
Explanations
politically charged language and references to historical events or figures
Negative Logits
Essex       -0.88
Hudson      -0.85
DAN         -0.83
Hudson      -0.83
MEL         -0.80
Hawkes      -0.80
Essex       -0.77
Aubrey      -0.77
MEL         -0.77
Mel         -0.76
Positive Logits
Kalam       0.87
Jackson     0.86
Amin        0.82
Jungkook    0.82
Coimbatore  0.78
Minne       0.77
Jimin       0.76
Jackson     0.74
JACKSON     0.74
Ackerman    0.73
Activations Density: 1.536%