INDEX
Explanations
references to specific names, such as "Tiger" and "Maddow"
mentions of specific individuals or characters involved in notable events or stories
Negative Logits
Token           Logit
beit            -0.83
agers           -0.76
iders           -0.71
Radiation       -0.63
PDATE           -0.61
disinfect       -0.60
ufact           -0.60
Scotia          -0.60
polarized       -0.60
magnification   -0.60
Positive Logits
Token           Logit
psey            0.86
reon            0.83
bies            0.79
pillar          0.77
eas             0.77
avanaugh        0.76
eon             0.75
eem             0.74
rane            0.74
iem             0.74
Activations Density: 0.048%