INDEX
Explanations
phrases indicating uncertainty or speculation
apparently factual statements
NEGATIVE LOGITS
confronti     -0.56
ษา            -0.54
ScopeManager  -0.52
eradish       -0.50
psack         -0.50
Renewal       -0.48
Drink         -0.48
HasFactory    -0.47
Dog           -0.47
Knife         -0.47
POSITIVE LOGITS
apparently     0.84
apparently     0.82
Apparently     0.80
Apparently     0.77
cheinend       0.56
aparentemente  0.52
presumably     0.51
umably         0.51
AppData        0.51
evidently      0.50
Activations Density: 0.009%
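
The density figure can be read as the fraction of sampled token positions on which the feature fires. The sketch below is only an illustration under that assumption: the function name `activation_density`, the zero threshold, and the sample data are hypothetical and not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a position counts as "active" when its activation is
    strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 9 active positions out of 100,000 tokens.
sample = [1.5] * 9 + [0.0] * 99_991
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low means the feature is silent on nearly all tokens, which fits a feature tied to a small family of words such as "apparently" and "presumably".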