INDEX
Explanations
adjectives describing the degree or possibility of something happening
phrases that express potential actions or possibilities
Negative Logits
Continued     -0.77
oother        -0.69
NPR           -0.64
finished      -0.62
ror           -0.61
amar          -0.61
RECT          -0.60
reassured     -0.59
iets          -0.58
Passive       -0.58
Positive Logits
imaginable    1.26
muster        0.96
conceivable   0.91
ãĥij           0.82
possibly      0.82
ãĤ«           0.79
achus         0.74
ever          0.74
ãĥ«           0.73
disposal      0.72
Activations Density 0.153%