Explanations
expressions of belief or opinion
Negative Logits
حوالہ  -0.53
pleaſure  -0.52
sū  -0.52
تمثيل  -0.50
مني  -0.48
colazione  -0.48
unterschiedlich  -0.47
Diſ  -0.47
faj  -0.46
وام  -0.46
Positive Logits
glaube  0.94
perhaps  0.94
mutlich  0.93
probably  0.91
maybe  0.90
Probably  0.87
probablemente  0.87
perhaps  0.84
chyba  0.84
provavelmente  0.83
Activations Density 0.111%