INDEX
Explanations
expressions related to feelings and emotions in the context of interpersonal relationships
Negative Logits
Token       Logit
///</       -0.80
muß         -0.77
läßt        -0.73
pertanto    -0.70
idéia       -0.69
Moslem      -0.69
müßte       -0.67
nunmehr     -0.66
sundry      -0.65
십시오      -0.65
Positive Logits
Token       Logit
idk         1.31
Idk         1.28
idk         1.25
Idk         1.20
tryna       1.03
irl         0.95
hella       0.94
tbh         0.92
lmao        0.91
eachother   0.89
Activations Density 0.330%