Cybernetics And Systems Analysis logo
Інформація редакції Аннотації статей Автори Архів
Кібернетика і Системний Аналіз
Міжнародний Науково-Теоретичний Журнал
УДК 004.22+004.93'11
Д.А. Рачковський

ШВИДКИЙ ПОШУК СХОЖИХ ГРАФІВ ЗА ВІДСТАННЮ РЕДАГУВАННЯ

Анотація. Наведено огляд індексних структур для швидкого пошуку за схожістю об’єктів, поданих деревами та графами. Як міру схожості використано відстань редагування. Розглянуто виконання запитів точного пошуку за схожістю. В основному описано алгоритми на основі стратегії фільтрації та уточнення, які використовують обернене індексування. Крім того, розглянуто алгоритми точного обчислення відстані редагування графів та її нижніх і верхніх меж.

Ключові слова: пошук за схожістю, графи, відстань редагування, найближчий сусід, індексні структури, обернене індексування.



ПОВНИЙ ТЕКСТ

Рачковский Дмитрий Андреевич,
доктор техн. наук, ведущий научный сотрудник Международного научно-учебного центра информационных технологий и систем НАН Украины и МОН Украины, Киев, dar@infrm.kiev.ua


СПИСОК ЛІТЕРАТУРИ

  1. Rachkovskij D.A. Index structures for fast similarity search for symbolic strings. Cybernetics and Systems Analysis. 2019. Vol. 55, N 5. P.

  2. Rachkovskij D.A. Real-valued vectors for fast distance and similarity estimation. Cybernetics and Systems Analysis. 2016. Vol. 52, N 6. P. 967–988.

  3. Rachkovskij D.A. Binary vectors for fast distance and similarity estimation. Cybernetics and Systems Analysis. 2017. Vol. 53, N 1. P. 138–156.

  4. Rachkovskij D.A. Distance-based index structures for fast similarity search. Cybernetics and Systems Analysis. 2017. Vol. 53, N 4. P. 636–658.

  5. Rachkovskij D.A. Index structures for fast similarity search for binary vectors. Cybernetics and Systems Analysis. 2017. Vol. 53, N 5. P. 799–820.

  6. Rachkovskij D.A. Index structures for fast similarity search for real-valued vectors. I. Cybernetics and Systems Analysis. 2018. Vol. 54, N 1. P. 152–164.

  7. Rachkovskij D.A. Index structures for fast similarity search for real-valued vectors. II. Cybernetics and Systems Analysis. 2018. Vol. 54, N 2. P. 320–335.

  8. Rachkovskij D.A., Slipchenko S.V. Similarity-based retrieval with structure-sensitive sparse binary distributed representations. Comp. Intelligence. 2012. Vol. 28, N 1. P. 106–129.

  9. Bille P. A survey on tree edit distance and related problems. Theoretical Computer Science. 2005. Vol. 337, N 1–3. P. 217–239.

  10. Tai K.-C. The tree-to-tree correction problem. Journal of the Association for Computing Machinery (JACM). 1979. Vol. 26. P. 422–433.

  11. Levenshtein V.I. Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics — Doklady. 1966. Vol. 10, N 8. P. 707–710.

  12. Gao X., Xiao B., Tao D., Li X. A survey of graph edit distance. Pattern Analysis and Applications. 2010. Vol. 13, N 1. P. 113–129.

  13. Sanfeliu A., Fu K.S. A distance measure between attributed relational graphs for pattern recognition. IEEE Trans. Syst. Man. Cybern. 1983. Vol. 13, N 3. P. 353–362.

  14. Zhang K., Jiang T. Some MAX SNP-hard results concerning unordered labeled trees. Information Processing Letters. 1994. Vol. 49. P. 249–254.

  15. Pawlik M., Augsten N. Rted: a robust algorithm for the tree edit distance. Proceedings of the VLDB Endowment. 2011. Vol. 5, N 4. P. 334–345.

  16. Pawlik M., Augsten N. Tree edit distance: Robust and memory-efficient. Information Systems. 2016. Vol. 56. P. 157–173.

  17. Kailing K., Kriegel H.-P., Schonauer S., Seidl T. Efficient similarity search for hierarchical data in large databases. Proc. EDBT’04. 2004. P. 676–693.

  18. Berchtold S., Keim D., Kriegel H.P. The X-tree: An index structure for high-dimensional data. Proc. VLDB’96. 1996. P. 28–39.

  19. Yang R., Kalnis P., Tung A.K.H. Similarity evaluation on tree-structured data. Proc. SIGMOD’05. 2005. P. 754–765.

  20. Guha S., Jagadish H.V., Koudas N., Srivastava D., Yu T. Integrating XML data sources using approximate joins. ACM Trans. Database Syst. 2006. Vol. 31, N 1. P. 161–207.

  21. Akutsu T., Fukagawa D., Takasu A. Approximating tree edit distance through string edit distance. Algorithmica. 2010. Vol. 57, N 2. P. 325–348.

  22. Tang Y., Cai Y., Mamoulis N. Scaling similarity joins over tree-structured data. Proc. VLDB Endowment. 2015. Vol. 8, N 11. P. 1130–1141.

  23. Zeng Z., Tung A.K.H., Wang J., Feng J., Zhou L. Comparing stars: On approximating graph edit distance. Proc. VLDB Endowment. 2009. Vol. 2, N 1. P. 25–36.

  24. Zhao X., Xiao C., Lin X., Wang W. Efficient graph similarity joins with edit distance constraints. Proc. ICDE’12. 2012. P. 834–845.

  25. Gouda K., Arafa M. An improved global lower bound for graph edit similarity search. Pattern Recogn. Lett. 2015. Vol. 58. P. 8–14.

  26. Wang G., Wang B., Yang X., Yu G. Efficiently indexing large sparse graphs for similarity search. IEEE Trans. Knowledge and Data Engineering. 2012. Vol. 24, N 3. P. 440–451.

  27. Bougleux S., Gauzere B., Blumenthal D.B., Brun L. Fast linear sum assignment with error-correction and no cost constraints. Pattern Recogn. Lett. https://doi.org/10.1016/j.patrec.2018.03.032.

  28. Kuhn H.W. The Hungarian method for the assignment problem. Naval Research Logistics Quarterly. 1955. Vol. 2, N 1–2. P. 83–97.

  29. Munkres J. Algorithms for the assignment and transportation problems. Journal of the Society for Industrial and Applied Mathematics. 1957. Vol. 5, N 1. P. 32–38.

  30. Wang X., Ding X., Tung A.K.H., Ying S., Jin H. An efficient graph indexing method. Proc. ICDE’12. 2012. P. 210–221.

  31. Zhao X., Xiao C., Lin X., Wang W., Ishikawa Y. Efficient processing of graph similarity queries with edit distance constraints. VLDB J. 2013. Vol. 22. P. 727–752.

  32. Riesen K., Fankhauser S., Bunke H. Speeding up graph edit distance computation with a bipartite heuristic. Proc. MLG’07. 2007. P. 21–24.

  33. Zheng W., Zou L., Lian X., Wang D., Zhao D. Efficient graph similarity search over large graph databases. IEEE TKDE. 2015. Vol. 27, N 4. P. 964–978.

  34. Li Z., Jian X., Lian X., Chen L. An efficient probabilistic approach for graph similarity search. Proc. ICDE’18. 2018. P. 533–544.

  35. Qin J., Xiao C. Pigeonring: a principle for faster thresholded similarity search. Proc. VLDB Endow. 2018. Vol. 12, N 1. P. 28–42.

  36. Zhao X., Xiao C., Lin X., Zhang W., Wang Y. Efficient structure similarity searches: A partition-based approach. The VLDB Journal. 2018. Vol. 27, N 1. P. 53–78.

  37. Liang Y., Zhao P. Similarity search in graph databases: A multilayered indexing approach. Proc. ICDE’17. 2017. P. 783–794.

  38. Abu-Aisheh Z., Raveaux R., Ramel J.-Y. Efficient k-nearest neighbors search in graph space. Pattern Recognition Letters. https://doi.org/10.1016/j.patrec.2018.05.001.

  39. Blumenthal D.B., Gamper J. Improved lower bounds for graph edit distance. IEEE TKDE. 2018. Vol. 30, N 3. P. 503–516.

  40. Blumenthal D.B., Gamper J. On the exact computation of the graph edit distance. Pattern Recognition Letters. 2018. https://doi.org/10.1016/j.patrec.2018.05.002/.

  41. Abu-Aisheh Z., Gauzere B., Bougleux S., Ramel J.-Y., Brun L., Raveaux R., Heroux P., Adam S. Graph edit distance contest: Results and future challenges. Pattern Recognition Letters. 2017. Vol. 100. P. 96–103.

  42. Riesen K., Neuhaus M., Bunke H. Bipartite graph matching for computing the edit distance of graphs. Proc. GbRPR’07. 2007. P. 1–12.

  43. Riesen K., Bunke H. Approximate graph edit distance computation by means of bipartite graph matching. Image and Vision Computing. 2009. Vol. 27, N 7. P. 950–959.

  44. Abu-Aisheh Z., Raveaux R., Ramel J.Y., Martineau P. An exact graph edit distance algorithm for solving pattern recognition problems. Proc. ICPRAM’15. 2015. P. 271–278.

  45. Abu-Aisheh Z., Raveaux R., Ramel J.-Y., Martineau P. A parallel graph edit distance algorithm. Expert Systems with Applications. 2018. Vol. 94. P. 41–57.

  46. Gouda K., Hassaan M. А novel edge-centric approach for graph edit similarity computation. Information Systems. 2019. Vol. 80. P. 91–106.

  47. Chen X., Huo H., Huan J., Vitter J.S. An efficient algorithm for graph edit distance computation. Knowledge-Based Systems. 2019. Vol. 163. P. 762–775.

  48. Zhou R., Hansen E.A. Beam-stack search: Integrating backtracking with beam search. Proc. ICAPS’05. 2005. P. 90–98.

  49. Chang L., Feng X., Lin X., Qin L., Zhang W. Efficient graph edit distance computation and verification via anchor-aware lower bound estimation. arXiv:1709.06810. 1 Oct 2017.

  50. Justice D., Hero A. A binary linear programming formulation of the graph edit distance. IEEE Trans. Pattern Anal. Mach. Intell. 2006. Vol. 28, N 8. P. 1200–1214.

  51. Lerouge J., Abu-Aisheh Z., Raveaux R., Heroux P., Adam S. New binary linear programming formulation to compute the graph edit distance. Pattern Recognition. 2017. Vol. 72. P. 254–265.

  52. Darwiche М., Raveaux R, Conte D., T’Kindt V. Graph Edit Distance in the exact context. Proc. S+SSPR’18. 2018. P. 304–314.

  53. Carletti V., Gauzere B., Brun L., Vento M. Approximate graph edit distance computation combining bipartite matching and exact neighborhood substructure distance. Proc. GbRPR’15. 2015. P. 188–197.

  54. Blumenthal D., Bougleux S., Gamper J., Brun L. Ring based approximation of graph edit distance. Proc. S+SSPR’18. 2018. P. 293–303.

  55. Bougleux S., Brun L., Carletti V., Foggia P., Gauzere B., Vento M. Graph edit distance as a quadratic assignment problem. Pattern Recognition Letters. 2017. Vol. 87. P. 38–46.

  56. Daller E., Bougleux S., Gauzere B., Brun L. Approximate graph edit distance by several local searches in parallel. Proc. ICPRAM’18. 2018. P. 149–158.

  57. Darwiche M., Conte D., Raveaux R., T’Kindt V. A local branching heuristic for solving a graph edit distance problem. Comp. & Oper. Res. https://doi.org/10.1016/j.cor.2018.02.002.

  58. Gouda K., Arafa M., Calders T. A novel hierarchical-based framework for upper bound computation of graph edit distance. Pattern Recognition. 2018. Vol. 80. P. 210–224.

  59. Slipchenko S.V., Rachkovskij D.A. Analogical mapping using similarity of binary distributed representations. Information Theories & Applications. 2009. Vol. 16, N 3. P. 269–290.

  60. Rachkovskij D.A. Some approaches to analogical mapping with structure-sensitive distributed representations. Journal of Experimental & Theoretical Artificial Intelligence. 2004. Vol. 16, N 3. P. 125–144.

  61. Rachkovskij D.A. Formation of similarity-reflecting binary vectors with random binary projections. Cybernetics and Systems Analysis. 2015. Vol. 51, N 2. P. 313–323.

  62. Рачковський Д.А., Гриценко В.І. Розподілене подання векторних даних на основі випадкових проекцій. Київ: Інтерсервіс, 2018. 216 c.

  63. Rachkovskij D.A., Revunova E.G. A randomized method for solving discrete ill-posed problems. Cybernetics and Systems Analysis. 2012. Vol. 48, N 4. P. 621–635.

  64. Revunova E.G. Model selection criteria for a linear model to solve discrete ill-posed problems on the basis of singular decomposition and random projection. Cybernetics and Systems Analysis. 2016. Vol. 52, N 4. P. 647–664.

  65. Revunova E.G. Averaging over matrices in solving discrete ill-posed problems on the basis of random projection. Proc. CSIT’17. 2017. P. 473–478.

  66. Riba P., Llados J., Fornes A., Dutta A. Large-scale graph indexing using binary embeddings of node contexts for information spotting in document image databases. Pattern Recognition Letters. 2017. Vol. 87. P. 203–211.

  67. Narayanan A., Chandramohan M., Venkatesan R., Chen L., Liu Y., Jaiswal S. Graph2vec: Learning distributed representations of graphs. Proc. MLG’17. 2017. P. 21:1–21:8.

  68. Goyal P., Ferrara E. Graph embedding techniques, applications, and performance: A survey. Knowledge Based Systems. 2018. Vol. 151. P. 78–94.

  69. Wu Z., Pan S., Chen F., Long G., Zhang C., Yu P.S. A comprehensive survey on graph neural networks. arXiv:1901.00596. 10 Mar. 2019.

  70. Bai Y., Ding H., Bian S., Chen T., Sun Y., Wang W. SimGNN: A neural network approach to fast graph similarity computation. Proc. WSDM’19. 2019. P. 384–392.
© 2019 Kibernetika.org. All rights reserved.