INDEX
Explanations
phrases indicating a serious or critical situation that requires action
phrases referencing situations or examples
Negative Logits
Token       Logit
terness     -0.77
istries     -0.76
ãĥ´ãĤ¡      -0.74
Ni          -0.68
erity       -0.66
ufact       -0.65
Austral     -0.65
Volume      -0.63
ãĥ³ãĤ¸      -0.62
verning     -0.61
Positive Logits
Token                 Logit
guiActiveUnfocused    0.61
Untitled              0.59
plague                0.56
anka                  0.56
ï                     0.55
Camer                 0.55
ours                  0.55
Quote                 0.54
focal                 0.54
cellence              0.54
Activations Density: 0.071%