INDEX
Explanations
occurrences of specific events or incidents
occurrences of the word "happened" in various contexts
Negative Logits
tan         -0.90
oyal        -0.83
uffed       -0.82
entric      -0.81
hart        -0.77
mens        -0.77
illusion    -0.76
ench        -0.76
ofer        -0.74
rylic       -0.74
Positive Logits
unfolding      0.79
Procedure      0.78
happened       0.76
yesterday      0.76
wrong          0.72
Myster         0.71
Ukrain         0.71
occ            0.71
perpetrated    0.71
Pax            0.70
Activations Density 0.037%