划分字母区间

题目要求

编写一个算法，输入为一个字符串s。目标是将字符串s划分为尽可能多的子串，每个子串中的字符都是唯一的，即同一个字符不能在多个子串中出现。同时，这些子串连接起来后，应该能够还原成原始的字符串s。

算法的输出应该是一个整数列表，列表中的每个元素代表每个子串的长度。

解题思路

为了解决这个问题，我们可以遵循以下步骤：

创建字符最后出现位置的映射：首先遍历字符串s，记录每个字符最后一次出现的索引位置。这将帮助我们确定每个字符能够存在的最远边界。
遍历字符串并划分子串：再次遍历字符串s，这次我们将维护两个变量，一个是当前子串的起始位置start，另一个是当前子串的结束位置end。对于每个遍历到的字符，我们更新end为当前字符最后出现位置和end的较大值。
确定子串的边界：当遍历到的字符索引等于当前end时，说明我们找到了一个子串的边界。这时，我们可以将子串的长度（end - start + 1）添加到结果列表中，并更新start为当前end的下一个位置，继续寻找下一个子串。
重复步骤 2 和 3：继续遍历直到字符串s的末尾，每次找到子串边界时，都将子串长度添加到结果列表中。
返回结果：遍历完成后，返回存储子串长度的列表作为最终结果。

通过以上步骤，我们可以确保每个字符只出现在一个子串中，并且子串的连接顺序与原始字符串s相同。

Golang 版本

package main

import (
	"fmt"
)

func partitionLabels(s string) []int {
	last := make(map[rune]int)
	for i, c := range s {
		last[c] = i
	}

	var result []int
	start, end := 0, 0

	for i, c := range s {
		if last[c] > end {
			end = last[c]
		}
		if i == end {
			result = append(result, end-start+1)
			start = i + 1
		}
	}

	return result
}

func main() {
	s := "ababcbacadefegdehijhklij"
	result := partitionLabels(s)
	fmt.Println(result)
}

这段代码实现了上述的算法思路。首先，它创建了一个映射来记录每个字符最后出现的位置。然后，它遍历字符串，使用start和end变量来追踪当前子串的边界，并在找到子串边界时更新结果列表。最后，它输出了每个子串的长度。

C++版本

#include <iostream>
#include <vector>
#include <string>
#include <unordered_map>

std::vector<int> partitionLabels(std::string s) {
    std::unordered_map<char, int> last;
    for (int i = 0; i < s.length(); ++i) {
        last[s[i]] = i;
    }

    std::vector<int> partitions;
    int start = 0, end = 0;

    for (int i = 0; i < s.length(); ++i) {
        end = std::max(end, last[s[i]]);
        if (i == end) {
            partitions.push_back(end - start + 1);
            start = i + 1;
        }
    }

    return partitions;
}

int main() {
    std::string s = "ababcbacadefegdehijhklij";
    std::vector<int> result = partitionLabels(s);
    for (int len : result) {
        std::cout << len << " ";
    }
    std::cout << std::endl;
    return 0;
}

这段 C++代码实现了相同的逻辑：首先创建一个哈希表来记录每个字符最后出现的位置，然后遍历字符串，使用start和end来确定每个片段的边界，并将片段的长度添加到结果向量中。最后，主函数中打印出每个片段的长度。

Python 版本

def partition_labels(s: str) -> [int]:
    last = {c: i for i, c in enumerate(s)}
    start = end = 0
    result = []

    for i, c in enumerate(s):
        end = max(end, last[c])
        if i == end:
            result.append(end - start + 1)
            start = i + 1

    return result

if __name__ == "__main__":
    s = "ababcbacadefegdehijhklij"
    print(partition_labels(s))

这段 Python 代码遵循了之前描述的算法思路：首先创建一个字典来记录每个字符最后出现的位置，然后遍历字符串，使用start和end来确定每个片段的边界，并将片段的长度添加到结果列表中。最后，在主函数中打印出每个片段的长度。

Java 版本

import java.util.ArrayList;
import java.util.List;

public class Solution {
    public List<Integer> partitionLabels(String s) {
        int[] last = new int[26]; // Assuming 's' contains only lowercase letters.
        for (int i = 0; i < s.length(); i++) {
            last[s.charAt(i) - 'a'] = i;
        }

        List<Integer> partitionLengths = new ArrayList<>();
        int start = 0, end = 0;

        for (int i = 0; i < s.length(); i++) {
            end = Math.max(end, last[s.charAt(i) - 'a']);
            if (i == end) {
                partitionLengths.add(end - start + 1);
                start = i + 1;
            }
        }

        return partitionLengths;
    }

    public static void main(String[] args) {
        Solution solution = new Solution();
        String s = "ababcbacadefegdehijhklij";
        List<Integer> partitionLengths = solution.partitionLabels(s);
        for (int length : partitionLengths) {
            System.out.print(length + " ");
        }
    }
}

这段 Java 代码实现了相同的逻辑：首先创建一个数组来记录每个字符最后出现的位置，然后遍历字符串，使用start和end来确定每个片段的边界，并将片段的长度添加到结果列表中。最后，在main方法中打印出每个片段的长度。

Rust 版本

use std::collections::HashMap;

fn partition_labels(s: String) -> Vec<i32> {
    let mut last_occurrence = HashMap::new();
    let s_chars: Vec<char> = s.chars().collect();

    // Record the last occurrence of each character
    for (i, &item) in s_chars.iter().enumerate() {
        last_occurrence.insert(item, i);
    }

    let mut partitions = Vec::new();
    let mut start = 0;
    let mut end = 0;

    for (i, &item) in s_chars.iter().enumerate() {
        if let Some(&last) = last_occurrence.get(&item) {
            end = end.max(last);
        }

        if i == end {
            partitions.push((end - start + 1) as i32);
            start = i + 1;
        }
    }

    partitions
}

fn main() {
    let s = String::from("ababcbacadefegdehijhklij");
    let partition_lengths = partition_labels(s);
    for length in partition_lengths {
        print!("{} ", length);
    }
}

这段 Rust 代码实现了相同的逻辑：首先创建一个哈希图来记录每个字符最后出现的位置，然后遍历字符串，使用start和end来确定每个片段的边界，并将片段的长度添加到结果向量中。最后，在main函数中打印出每个片段的长度。

总结

上述问题的解法遵循了一个通用的算法思路，无论是在 Python、Java 还是 Rust 语言中。核心思想是利用贪心算法的策略来确定字符串中每个片段的边界。解法的步骤可以总结如下：

记录字符最后出现的位置：首先遍历整个字符串，记录下每个字符最后一次出现的索引位置。这可以通过哈希表（在 Rust 中是HashMap，在 Java 中是数组，因为输入限定为小写字母）来实现。
初始化起始和结束索引：定义两个变量start和end，分别用来标记当前片段的起始和结束位置。初始时，start和end都设置为 0。
遍历字符串并更新结束索引：再次遍历字符串，对于每个字符，更新end为当前字符最后出现位置和end的较大值。这确保了当前片段包含了该字符的所有出现。
确定片段边界并记录长度：当当前遍历的索引与end相等时，意味着已经到达了一个片段的边界。此时，将当前片段的长度（end - start + 1）记录下来，并更新start为end + 1，以便开始下一个片段的查找。
重复步骤 3 和 4 直到结束：继续遍历直到字符串结束，每次找到一个片段边界时，都记录下该片段的长度。
返回结果：最终返回一个包含所有片段长度的列表。

这种方法的时间复杂度为 O(N)，其中 N 是字符串的长度，因为每个字符被遍历了两次。空间复杂度取决于字符集的大小，在本题中是 O(1)，因为输入字符串只包含小写字母，所以哈希表的大小最多为 26。

hot-100