Several ways to handle missing keys in a dictionary

in, get, and KeyError

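Consider a city voting system: we can create a dictionary that maps each city to its vote count. If a key is not in the dictionary, its count defaults to 0, and then the count is incremented.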

  • Increment the count directly. If the key does not exist, this raises KeyError
counters = {'shanghai': 9, 'beijing': 8}
key = 'hangzhou'
counters[key] += 1  # KeyError: 'hangzhou'

print(counters)
  • in: first use if...in to check whether the key is in the dictionary
counters = {'shanghai': 9, 'beijing': 8}

key = 'hangzhou'
if key in counters:
    count = counters[key]
else:
    count = 0

counters[key] = count+1
print(counters)
############ using not in ###############
counters = {'shanghai': 9, 'beijing': 8}
key = 'hangzhou'

if key not in counters:
    counters[key] = 0
counters[key] += 1

print(counters)

  • Catch the KeyError exception
counters = {'shanghai': 9, 'beijing': 8}
key = 'hangzhou'
try:
    count = counters[key]
except KeyError:
    count = 0
counters[key] = count + 1

print(counters)
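  • Use get: compared with the two approaches above, get is the more common choice. Its second argument specifies the default value to return when the key does not exist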
counters = {'shanghai': 9, 'beijing': 8}
key = 'hangzhou'
count = counters.get(key, 0)
counters[key] = count + 1
print(counters)
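
The same get pattern extends naturally to tallying a whole batch of votes. The sketch below assumes a hypothetical list named ballots (not part of the example above) that holds one city name per vote.

```python
# Hypothetical input: one city name per ballot cast.
ballots = ['hangzhou', 'shanghai', 'hangzhou', 'beijing']

counters = {'shanghai': 9, 'beijing': 8}
for city in ballots:
    # get returns 0 for a city that has no entry yet
    counters[city] = counters.get(city, 0) + 1

print(counters)  # {'shanghai': 10, 'beijing': 9, 'hangzhou': 2}
```

Because the lookup and the default live in a single expression, the loop body stays one line long regardless of how many new cities appear.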

setdefault

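  • setdefault: if the key is not in the dict, insert it with the given default value and return that default; if the key is already in the dict, return its existing value and leave the dict unchanged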
d = {'name': 'jerry'}
age = d.setdefault('age', 18)
print(age)  # 18
print(d)  # {'name': 'jerry', 'age': 18}
################################
name = d.setdefault('name', 'tom')
print(name)  # jerry
print(d)  # {'name': 'jerry', 'age': 18}
cities = {'anhui': ['hefei', 'wuhu'], 'zhejiang': ['hangzhou']}
prov1 = 'jiangsu'
city1 = 'nanjing'
prov2 = 'zhejiang'
city2 = 'shaoxing'
prov3 = 'guangdong'
city3 = 'guangzhou'
prov_city1 = cities.setdefault(prov1, [])
print(cities)
prov_city1.append(city1)
print(cities)
cities.setdefault(prov2, []).append(city2)
cities.setdefault(prov3, []).append(city3)
print(cities)
############## Output ##############
{'anhui': ['hefei', 'wuhu'], 'zhejiang': ['hangzhou'], 'jiangsu': []}
{'anhui': ['hefei', 'wuhu'], 'zhejiang': ['hangzhou'], 'jiangsu': ['nanjing']}
{'anhui': ['hefei', 'wuhu'], 'zhejiang': ['hangzhou', 'shaoxing'], 'jiangsu': ['nanjing'], 'guangdong': ['guangzhou']}

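This approach has a drawback: a new list instance is created on every call, whether or not the province prov is already in the dictionary, because the [] argument is evaluated before setdefault runs.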
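
One way to observe this cost is to count how many lists get built. In the sketch below, make_list and created are hypothetical helpers introduced only for the demonstration; the point is that Python evaluates the default argument before setdefault is called.

```python
created = []  # hypothetical log of every list built as a default


def make_list():
    new_list = []
    created.append(new_list)
    return new_list


cities = {'zhejiang': ['hangzhou']}
# 'zhejiang' already exists, yet make_list() still runs and its result is discarded
cities.setdefault('zhejiang', make_list()).append('shaoxing')
# 'jiangsu' is missing, so this time the new list is actually stored
cities.setdefault('jiangsu', make_list()).append('nanjing')

print(len(created))  # 2
print(cities)  # {'zhejiang': ['hangzhou', 'shaoxing'], 'jiangsu': ['nanjing']}
```

For an empty list the waste is small, but it grows if the default is expensive to construct.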

defaultdict

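  • If the dictionary you manage may need arbitrary keys added to it, consider using a defaultdict instance from the standard library's collections module
  • defaultdict(default_factory[, ...]): when a key is missing, the factory function default_factory is called to produce the value for that key; default_factory must be callable with no arguments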
class defaultdict(dict):
    def __init__(self, default_factory=None, **kwargs): # known case of _collections.defaultdict.__init__
        """
        defaultdict(default_factory[, ...]) --> dict with default factory
        
        The default factory is called without arguments to produce
        a new value when a key is not present, in __getitem__ only.
        A defaultdict compares equal to a dict with the same items.
        All remaining arguments are treated the same as if they were
        passed to the dict constructor, including keyword arguments.
        
        # (copied from class doc)
        """
from collections import defaultdict
cities = {'anhui': ['hefei', 'wuhu'], 'zhejiang': ['hangzhou']}

cities = defaultdict(list, cities)
# print(cities)
# print(cities['guangdong'])
cities['guangdong'].append('guangzhou')
print(cities)
print(cities.get('guangdong'))
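
Note that only a d[k] lookup triggers the factory; get and in leave the dictionary untouched. A minimal sketch of the difference, using a smaller version of the data above:

```python
from collections import defaultdict

cities = defaultdict(list, {'anhui': ['hefei', 'wuhu']})

print(cities.get('jiangsu'))  # None: get does not call the factory
print('jiangsu' in cities)    # False: neither does a membership test
print(cities['jiangsu'])      # []: a d[k] lookup calls list() and stores the result
print('jiangsu' in cities)    # True: the lookup above inserted the key
```

This also means that merely reading a missing key with d[k] grows the dictionary, which is worth keeping in mind when iterating over it later.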
from collections import defaultdict

class Cities:
    def __init__(self):
        self.data = defaultdict(list)

    def add(self, prov, city):
        self.data[prov].append(city)

cities = Cities()
cities.add('anhui', 'hefei')
cities.add('anhui', 'wuhu')
cities.add('jiangsu', 'nanjing')
print(cities.data)
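
The vote counter from the first section can be written the same way. This sketch assumes int as the factory, which returns 0 when called without arguments; the ballots list is hypothetical sample data.

```python
from collections import defaultdict

counters = defaultdict(int, {'shanghai': 9, 'beijing': 8})
ballots = ['hangzhou', 'shanghai', 'hangzhou']  # hypothetical sample votes

for city in ballots:
    counters[city] += 1  # a missing city starts from int() == 0

print(dict(counters))  # {'shanghai': 10, 'beijing': 8, 'hangzhou': 2}
```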

__missing__

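  • __missing__: a method that dict subclasses can define; it is called when a d[k] lookup does not find the key, and its return value becomes the result of the lookup. If the method is not defined, looking up a missing key raises KeyError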
class MyDict(dict):
    def __missing__(self, key):
        default_value = 1000
        # Note: this writes to the instance's attribute namespace, not to the mapping itself
        self.__dict__[key] = default_value
        return default_value

my_dict = MyDict()
print(my_dict['1'])  # 1000
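
Note that the example above writes the default into self.__dict__, the instance's attribute namespace, rather than into the mapping itself, so the key remains missing and __missing__ runs on every lookup. If the intent is to cache the default as a real dictionary entry, one possible variant stores it with self[key]. CachingDict and its calls list are hypothetical names used only for this sketch.

```python
class CachingDict(dict):
    """Hypothetical variant that stores the default in the mapping itself."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.calls = []  # records each key that reached __missing__

    def __missing__(self, key):
        self.calls.append(key)
        default_value = 1000
        self[key] = default_value  # becomes a regular dictionary entry
        return default_value


d = CachingDict()
print(d['1'])   # 1000, computed by __missing__
print(d['1'])   # 1000, found directly this time
print(d)        # {'1': 1000}
print(d.calls)  # ['1']
```

Once the value is stored this way, get and in will also see the key, which may or may not be what a given application wants.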
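  • __missing__ and __getitem__
    • __getitem__ implements the [] operator. Accessing an entry with [] actually calls the dictionary object's __getitem__ method; when the key does not exist and the object defines __missing__, that method is called. In other words, __missing__ is invoked only by the d[k] syntax, and only when the key is missing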
class MyDict(dict):
    def __missing__(self, key):
        default_value = 1000
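        # Stored as an instance attribute, not as a dictionary entry, so 'c' stays missing from the mapping
        self.__dict__[key] = default_value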
        return default_value

my_dict = MyDict(a=1, b=2)
print(my_dict['c'])  # 1000
print(my_dict.get('c'))  # None
  • Why does calling get return None rather than 1000?
class MyDict(dict):
    def __missing__(self, key):
        default_value = 1000
        self.__dict__[key] = default_value
        print(f'missing key: {key}')
        return default_value

    def __getitem__(self, key):
        print(f'getitem key: {key}')
        return 2000

my_dict = MyDict(a=1, b=2)
print(my_dict['c'])  # 2000: the overridden __getitem__ is called, and __missing__ is never called
print(my_dict.get('c'))  # None
print(my_dict)

The get method does not call __getitem__; __missing__ and __getitem__ apply only to the d[k] syntax.
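
A small sketch can confirm which operations reach __missing__. TracingDict and its missed list are hypothetical names introduced for illustration.

```python
class TracingDict(dict):
    """Hypothetical dict subclass that logs every __missing__ call."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.missed = []

    def __missing__(self, key):
        self.missed.append(key)
        return 1000


d = TracingDict(a=1, b=2)
print(d['c'])         # 1000: d[k] falls back to __missing__
print(d.get('c'))     # None: get bypasses __missing__
print('c' in d)       # False: so does a membership test
print(d.get('c', 0))  # 0: the explicit default is returned
print(d.missed)       # ['c']
```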

  • The dictionary's get() method
# dict's get method
    def get(self, *args, **kwargs): # real signature unknown
        """ Return the value for key if key is in the dictionary, else default. """
        pass
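# Conceptually it behaves like the code below. (The built-in is implemented in C and looks the key up
# directly; it does not go through self[key], which is why __missing__ is never triggered.)
    def get(self, key, default=None):
        if key in self:
            return dict.__getitem__(self, key)
        return default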
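  • If the default value must be computed from the key itself, you can define your own dict subclass and implement the __missing__ method
# Associate each picture's path name with its open file handle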
from collections import defaultdict


def open_picture(profile_path):
    try:
        return open(profile_path, 'a+b')
    except OSError:
        print(f'Failed to open path {profile_path}')
        raise


path = 'profile_1234.png'


class Pictures(dict):
    def __missing__(self, key):
        value = open_picture(key)
        self[key] = value
        return value


pictures = Pictures()
# pictures = defaultdict(open_picture)  # the factory is called with no arguments, so a lookup fails with TypeError: open_picture() missing 1 required positional argument: 'profile_path'
handle = pictures[path]
handle.seek(0)
image_data = handle.read()
print(pictures)
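
The same idea can be tried without touching the file system. In this sketch, label_for and Labels are hypothetical stand-ins for open_picture and Pictures; it contrasts a key-aware __missing__ with a defaultdict given a one-argument factory.

```python
from collections import defaultdict


def label_for(code):
    # Hypothetical helper: the default depends on the key
    return f'city-{code}'


class Labels(dict):
    def __missing__(self, key):
        value = label_for(key)
        self[key] = value  # cache the computed default as a real entry
        return value


labels = Labels()
print(labels['hz'])  # city-hz
print(labels)        # {'hz': 'city-hz'}

# defaultdict calls its factory with no arguments, so the key never reaches label_for
broken = defaultdict(label_for)
try:
    broken['hz']
except TypeError as exc:
    error_name = type(exc).__name__
    print(error_name)  # TypeError
```

Since the factory raised before returning a value, nothing is inserted into broken.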

References and further reading