INDEX
Explanations
mentions of hotels
references to hotels
Negative Logits
xit        -0.82
Anarchy    -0.78
İĭ         -0.73
advers     -0.70
nces       -0.68
nir        -0.68
umbnail    -0.67
Izan       -0.66
cale       -0.66
Prompt     -0.65
Positive Logits
rooms             1.03
accommodations    0.98
hotel             0.93
guests            0.89
room              0.88
hotels            0.88
accommodation     0.87
Hotel             0.87
room              0.82
occupancy         0.82
Activations Density: 0.029%