INDEX
Explanations
phrases related to physical descriptions and characteristics
adjectives and descriptors that convey emotional or subjective evaluations
Negative Logits
¬¼      -0.84
swick   -0.78
afety   -0.77
OND     -0.67
uay     -0.67
URA     -0.66
OME     -0.65
thora   -0.62
§       -0.62
UF      -0.61
Positive Logits
albeit    1.04
but       0.97
though    0.88
although  0.88
however   0.85
huh       0.83
whereas   0.81
except    0.79
yet       0.74
eg        0.73
Activations Density: 0.448%