Explanations
emotional and personal reflections or expressions
Negative Logits
mony      -0.66
noon      -0.65
mon       -0.62
ierrez    -0.60
hma       -0.59
artisan   -0.57
jury      -0.57
geoning   -0.57
ikers     -0.56
rontal    -0.56
Positive Logits
fuss      0.95
entails   0.82
wrought   0.81
boils     0.78
entail    0.78
Means     0.77
Learned   0.77
meant     0.77
DERR      0.76
ãĤ´       0.75
Activations Density: 0.741%