二分查找算法
python的bisect库

概述

对于一个有序序列,我们通常使用二分查找的方式来定位某一元素

二分查找

def binary_search(array, item):
    low = 0
    high = len(array) - 1
    while low <= high:   # 注意
        mid = (low + high) // 2
        guess = array[mid]
        if guess == item:
            return mid  # 找到则返回索引
        elif guess < item:
            low = mid + 1
        else:
            high = mid - 1
    return None

if __name__ == '__main__':
    l = [1, 2, 3, 15, 18, 19]
    print(binary_search(l, 3))
  1. 为什么写成while low <= high
    因为初始化 high 的赋值是 len(array) - 1,即最后一个元素的索引。相当于搜索区间是 [low, high],退出循环的条件是搜索区间为空。这里是low==high+1时,即最后搜索空间[high+1, high]为空退出循环,满足条件。
    如果写成while low < high,则退出条件就变成了low==high,即最后搜索空间[high, high]为非空,但此时循环此时已经退出了,代入数字2,即[2,2]就会导致漏掉索引2没有搜索
  2. 如果写成while low < high,该怎么改写
def binary_search(array, item):
    low = 0
    high = len(array)
    while low < high:
        mid = (low + high) // 2
        guess = array[mid]
        if guess == item:
            return mid
        elif guess < item:
            low = mid + 1
        else:
            high = mid
    return None

if __name__ == '__main__':
    l = [1, 2, 3, 15, 18, 19]
    print(binary_search(l, 3))

二分插入

在有序序列中插入指定数字

def binary_insert(array, item):
    low = 0
    high = len(array)
    while low < high:
        mid = (low + high) // 2
        guess = array[mid]
        if guess < item:
            low = mid + 1
        else:
            high = mid
    array.insert(low, item)
    return array

if __name__ == '__main__':
    l = [1, 2, 3, 15, 18, 19]
    print(binary_insert(l, 19))
    l = [1, 2, 3, 15, 18, 19]
    print(binary_insert(l, 1))
    l = [1, 2, 3, 15, 18, 19]
    print(binary_insert(l, 20))
    l = [1, 2, 3, 15, 18, 19]
    print(binary_insert(l, -1))

bisect

bisect是python标准库,详细见官方文档

import bisect

l = [1, 2, 2, 4, 15, 18, 19]
print(bisect.bisect(l, 19))
print(bisect.bisect(l, 2))
print(bisect.bisect(l, 4))
print(bisect.bisect_right(l, 2))
print(bisect.bisect_left(l, 2))

bisect.insort(l, 19)
print(l)
l = [1, 2, 2, 3, 15, 18, 19]
bisect.insort(l, 20)
print(l)
l = [1, 2, 2, 3, 15, 18, 19]
bisect.insort_right(l, 20)
print(l)

l = [1, 2, 2, 3, 15, 18, 19]
bisect.insort_right(l, 2)
print(l)

l = [1, 2, 2, 3, 15, 18, 19]
bisect.insort_left(l, 2)
print(l)

我们可以来看看源码:

  • bisect.bisect_left(a, x, lo=0, hi=None)
def bisect_left(a, x, lo=0, hi=None):
    """Return the index where to insert item x in list a, assuming a is sorted.

    The return value i is such that all e in a[:i] have e < x, and all e in
    a[i:] have e >= x.  So if x already appears in the list, a.insert(x) will
    insert just before the leftmost x already there.

    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    """

    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if a[mid] < x: lo = mid+1
        else: hi = mid
    return lo
  • binsect.insort_right(a, x, lo=0, hi=None)
def insort_right(a, x, lo=0, hi=None):
    """Insert item x in list a, and keep it sorted assuming a is sorted.

    If x is already in a, insert it to the right of the rightmost x.

    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    """

    lo = bisect_right(a, x, lo, hi)
    a.insert(lo, x)

可以看出,源码实现跟我们自己实现的二分插入基本一致

使用bisect完成搜索功能

from bisect import *


def index(a, x):
    'Locate the leftmost value exactly equal to x'
    i = bisect_left(a, x)
    if i != len(a) and a[i] == x:
        return i
    raise ValueError


def find_lt(a, x):
    'Find rightmost value less than x'
    i = bisect_left(a, x)
    if i:
        return a[i - 1]
    raise ValueError


def find_le(a, x):
    'Find rightmost value less than or equal to x'
    i = bisect_right(a, x)
    if i:
        return a[i - 1]
    raise ValueError


def find_gt(a, x):
    'Find leftmost value greater than x'
    i = bisect_right(a, x)
    if i != len(a):
        return a[i]
    raise ValueError


def find_ge(a, x):
    'Find leftmost item greater than or equal to x'
    i = bisect_left(a, x)
    if i != len(a):
        return a[i]
    raise ValueError

扩展阅读