INDEX
Explanations
instances of actions or events that have a significant impact or consequence
instances of the word "act" and its variations, particularly in contexts emphasizing actions or behaviors
Negative Logits
Token     Logit
Parad     -0.75
nda       -0.74
Galile    -0.65
lake      -0.63
castle    -0.61
Plate     -0.61
Dise      -0.60
Mous      -0.60
hander    -0.60
Stud      -0.59
Positive Logits
Token     Logit
ional     1.31
uary      1.30
uated     1.22
ivism     1.13
uation    1.10
uating    1.05
ivity     1.03
uate      1.02
ual       1.00
inic      1.00
Activations Density: 0.022%
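
As a rough illustration of how a density figure like this could be read, the sketch below treats it as the percentage of sampled tokens on which the feature's activation is above zero. That reading, the function name `activation_density`, and the sample data are all assumptions made for illustration. The page does not state how the figure was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all; the source page does not define it.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 10,000 tokens, only 2 of which activate the feature.
sample = [0.0] * 9998 + [1.7, 0.4]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density well below 1% would mean the feature is sparse and fires on only a small fraction of tokens.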