INDEX
Explanations
words or phrases indicating abrupt changes or surprising events
Negative Logits
Token             Logit
ⓧ                 -0.77
للمعارف           -0.71
Culver            -0.70
<<<<<<<<<<<<<<    -0.67
ritch             -0.64
frutto            -0.64
ArrowToggle       -0.63
câte              -0.63
jmniej            -0.63
primaire          -0.62
Positive Logits
Token             Logit
suddenly          1.28
Sudden            1.20
Suddenly          1.14
Sudden            1.13
Suddenly          1.09
sudden            1.02
suddenly          1.02
SUD               1.00
SUD               0.92
sud               0.86
Activations Density: 0.006%