INDEX
Explanations
significant phrases indicating introspection or reflection
references to specific events, times, and contextual details related to history or observations
Negative Logits
Token          Logit
NBA            -0.59
rightfully     -0.58
oğan           -0.56
Remain         -0.56
uncond         -0.55
specifically   -0.55
unamb          -0.54
leneck         -0.53
essentially    -0.53
aganda         -0.53
Positive Logits
Token          Logit
itiz           0.66
occurs         0.65
Aires          0.65
tarians        0.64
pires          0.63
happens        0.62
nces           0.62
ses            0.62
ishes          0.62
WARE           0.61
Activations Density: 0.679%