Explanations
verbs in base form after the word 'can'
negations or phrases expressing inability or restrictions
Negative Logits
Token               Logit
Reck                -0.65
Wol                 -0.64
intending           -0.63
Cly                 -0.60
Notting             -0.59
Kub                 -0.59
Respons             -0.58
guiActiveUnfocused  -0.56
Fell                -0.55
Oz                  -0.55
Positive Logits
Token               Logit
estine              0.86
atche               0.86
adian               0.78
safely              0.78
osta                0.77
rouse               0.76
apesh               0.74
muster              0.74
atell               0.74
withstand           0.74
Activations Density: 0.745%