INDEX
Explanations
words and phrases that indicate agreement or consensus
Negative Logits
leider       -0.16
icc          -0.16
ibe          -0.14
効           -0.13
ウン         -0.13
igu          -0.13
ızı          -0.12
apologies    -0.12
ubar         -0.12
�t           -0.12
Positive Logits
according     0.84
according     0.69
According     0.66
According     0.62
ccording      0.55
según         0.49
selon         0.48
根据          0.46
accordance    0.42
podle         0.42
Activations Density: 0.028%