Explanations
"was" or "were" when describing origins
Negative Logits
Currently     0.54
目前          0.52
сейчас        0.52
atualmente    0.50
Currently     0.50
현재          0.49
目前          0.49
obecnie       0.49
attualmente   0.47
ตอนนี้         0.46
Positive Logits
originally     0.82
originally     0.67
originalmente  0.59
instrumental   0.55
able           0.54
reportedly     0.52
ursprünglich   0.49
Originally     0.49
Originally     0.48
damals         0.48
Activations Density 0.077%