INDEX
Explanations
descriptions of ongoing activities or processes
Negative Logits
ns        -0.67
TY        -0.66
ptive     -0.64
ags       -0.63
apolis    -0.62
meg       -0.61
ulous     -0.60
zed       -0.60
oric      -0.60
aign      -0.58
Positive Logits
side      1.08
wagon     0.78
side      0.74
Side      0.73
Aven      0.70
shore     0.69
isan      0.69
*=-       0.68
Clause    0.66
Came      0.66
Activations Density 1.139%