INDEX
Explanations
phrases related to completion or achievement of tasks
phrases indicating potential actions or occurrences
Negative Logits
=~          -0.62
azel        -0.61
alike       -0.59
anything    -0.59
likewise    -0.59
both        -0.58
also        -0.57
etc         -0.57
auga        -0.57
robat       -0.56
Positive Logits
marginally   0.95
spor         0.93
insofar      0.90
ONE          0.85
fraction     0.76
peripher     0.76
temporarily  0.74
curs         0.73
fleeting     0.70
briefly      0.69
Activations Density: 0.242%