Possibilities to Estimate Reliability in XML Format Data Communication in Web Based Distributed Applications

Tuulikki Gyllensvard Tor Stalhane Department of Computer and Information Science (IDI) Norwegian University of Science and Technology (NTNU)

Trondheim, Norway

{tuulikki, stalhane}@idi.ntnu.no

Abstract— Different possibilities to establish reliability for a web application’s communication with an external service provider are examined in this paper. The focus lies on a distributed, two node connection that is using XML messages and various PHP extensions for the communication. Apart from manipulating XML messages or XML Schemas syntax or constraints, it is proposed that some traditional software reliability theory may be used to model the communication. Finally, some ideas on other approaches to model reliability are given.

Keywords: XML; reliability; data communication; quality; modeling

I. INTRODUCTION

Previous reliability research in the broader field of web services has mainly focused on XML grammar and syntax issues [1][2][3], on reliability and security issues that arise because the service partners are dynamic, on SOAP messages, on achieving reliability by applying models to the design process, and on models based on meta-language information [4][5]. Some of the reliability research for distributed systems is concerned with network issues, such as keeping the system running despite network failures. This kind of reliability theory is less applicable to a simple, two-node distributed system with XML data communication between web applications, which is the focus of this paper. This paper's case study web application, DAIM (Digital archiving and submitting of master theses), uses PHP, XML and cURL (Client for URLs) for the data communication with an external printing service provider. The printer uses the information provided by the student through the web interface, together with the thesis file, to print the thesis. PHP supports both the XML and the cURL technologies with extensions such as SimpleXML and PHP/CURL, which makes them easy to use from within PHP [6].

This paper attempts to combine knowledge of reliability theory from diverse communication and software research fields in order to develop a good and usable reliability model for this simple web-based XML data communication application. The focus of the model is both on the particularities of XML format communication from a web interface and on obtaining a quantitative measure.

The rest of the paper is organized as follows: The next section gives an introduction to DAIM and a background to the technologies employed by the DAIM web application in its communication with the external service provider. A background to reliability theory is also given. Section 3 discusses possibilities to determine the reliability of the DAIM XML data exchange, and some conclusions are drawn in Section 4.

II. BACKGROUND

A. DAIM

DAIM is a web application developed at our university to facilitate the administration of master theses. The student can apply for a thesis project, submit the thesis to a thesis database, and have the final version printed by an external printer and delivered to the university. This paper explores ways of estimating the reliability of DAIM's interaction with the external printer.

B. Curl and PHP/CURL

With PHP's Curl extension (PHP/CURL), a variety of web resources can be used from within a PHP script. Curl can use several protocols, for instance FTP and HTTP. The connection to the other system is made in four steps: initialization, setting options, execution, and closing [7]. The option-setting step is the one that demands the most elaboration. In DAIM, the data typed in by the student or obtained from database records is sent over HTTP via different curl setopt arguments such as CURLOPT_POST, CURLOPT_PUT, and CURLOPT_POSTFIELDS (which work like regular HTTP post, put and postfields). The protocol to be used can be specified by the user in a URL option. Curl is quite widely supported and offers a convenient solution for this type of data exchange, where XML documents and URLs are sent between two applications. In DAIM, Curl is used to send and receive XML documents instead of using, for instance, SOAP messages.
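The four-step pattern can be illustrated with a minimal sketch. It is written in Python with the standard library rather than in PHP, so it is an analogue of the PHP/CURL calls and not DAIM's actual code. The URL, the payload and the function names are hypothetical.

```python
import urllib.request


def build_price_request(url: str, xml_payload: bytes) -> urllib.request.Request:
    """Steps 1 and 2: initialize the request and set its options."""
    return urllib.request.Request(
        url,
        data=xml_payload,                             # comparable to CURLOPT_POSTFIELDS
        method="POST",                                # comparable to CURLOPT_POST
        headers={"Content-Type": "application/xml"},  # assumed content type
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> bytes:
    """Steps 3 and 4: execute the transfer, then close the connection."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()  # leaving the with-block closes the connection


# Hypothetical endpoint; nothing is sent until send() is called.
request = build_price_request("http://printer.example/price", b"<priceRequest/>")
```

Keeping the option-setting step separate from execution makes it possible to inspect or log the outgoing request before anything is put on the channel.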

1-4244-0614-5/07/$20.00 ©2007 IEEE.

C. XML, Parsing, XPath and PHP Extensions

An XML document consists of user-defined elements: tags and data. The first tags of an XML document are a declaration and processing instructions to the software reading the document. Typically this includes references to possible namespaces via URIs and a reference to an XML Schema, if one is used. A namespace refers to a vocabulary specified by a URI and is used to avoid name collisions between elements and attributes. With a namespace, data that is tagged with the same tag can be kept apart. No namespaces are needed in DAIM. An XML Schema can be seen as a grammar specifying constraints for the documents that are to be validated against that XML Schema. Figure 1 shows how this is done in DAIM. An XML document has to be both well-formed and valid. Well-formed means no missing end tags, no overlapping elements, quoted attribute values, etc. Valid means that the document is structured according to, and obeys the constraints specified in, the XML Schema [8][9].
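The well-formedness half of this requirement can be illustrated with a short sketch. It is written in Python with the standard library parser and is not DAIM's PHP code. It checks well-formedness only; validity against an XML Schema is a separate step that this sketch does not cover. The element names are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the document as well-formed XML."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# Hypothetical messages illustrating the three kinds of error named above.
missing_end_tag = "<order><id>1</order>"
overlapping = "<order><id></order></id>"
unquoted_attribute = "<order copies=3/>"
```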

In order to handle an XML document, an application needs to parse the document to an in-memory representation, often the XPath data model. Searches and operations can be performed on the data model, which can later be transformed back to an XML document. The DOM (Document Object Model) is also used in DAIM to get a platform- and browser-independent representation. The PHP extension SimpleXML, again with its own representation, is also used. An XML document is sequential, whereas the DOM and the XPath data models have tree structures [9]. Many test designs are also inspired by the possibilities the tree form presents [1][2][3].

D. The XML Data Communication in DAIM

The student delivers her master thesis to the external printer via a PHP web interface. Figure 2 shows the request part of the communication. This procedure is repeated for each of the XML messages. The communication with the printer is done through a PHP file that, with the help of different PHP extensions, transforms the user input data, opens a Curl connection, and closes it when finished. DAIM uses four XML Schemas for the messages sent to the printer:

• A small schema for the number of pages of the thesis and the number of color pages, sent to the printer for price information.

• A reply XML Schema with the price information.

• A schema for ordering printed copies of the thesis.

• A preview schema which sends back a URL to the preview.

All the XML Schemas for DAIM's XML communication are stored at the sending application in order to avoid inconsistencies that stem from making changes in one place and not the other. This means that the validations conducted before sending and after receiving are done against the same XML Schema (see Fig. 1). The validation checks, for example, whether the data types are correct, whether the number of elements is correct, and whether the order of the elements is correct. All instances of the XML documents have a unique id as their first element to keep track of the orders.
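The three kinds of check named above (data type, number of elements, order of elements) can be sketched with a hand-written stand-in. This is not XML Schema validation and not DAIM's code. The element names `id`, `pages` and `colorPages` and the type rules are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout of a price request: element names in the required
# order, each paired with a check on the element's text value.
EXPECTED = [("id", str.isalnum), ("pages", str.isdigit), ("colorPages", str.isdigit)]


def validate_price_request(document: str) -> list[str]:
    """Return a list of constraint violations; an empty list means valid."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    # Number and order of elements: the child tags must match exactly.
    names = [child.tag for child in root]
    expected_names = [name for name, _ in EXPECTED]
    if names != expected_names:
        return [f"expected elements {expected_names}, got {names}"]

    # Data types: each value must pass the check for its element.
    errors = []
    for child, (name, check) in zip(root, EXPECTED):
        if not check((child.text or "").strip()):
            errors.append(f"bad value for {name}: {child.text!r}")
    return errors
```

Running the same function before sending and after receiving mirrors the double validation against a single stored Schema.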

Figure 1. Schema validation. The message is validated twice, once before being sent and once after.

The XML document is generated from a PHP form filled in by the user, with the help of PHP's DOM extension and XSLT. For the price request, the user specifies the number of pages to be printed in color. The XML price request document is generated and sent to the printer as soon as JavaScript detects that the user's mouse has left the field with the number of color pages. When the response message comes back to the client with the price suggestion, the XML message is converted to a DOM model from which XPath can extract the needed data, such as the price, to be displayed by PHP. For thesis submission, files can be in zip, PDF, or jpg format. The URL pointing to where the thesis is stored is submitted. A checksum is used to check the received message with the URL to the thesis PDF for errors that may have arisen when the message was sent through the channel. If the printer service is closed or down when an order is received, the order is stored in a queue. The web interface with demos can be found at [10].
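Extracting the price from the reply can be sketched as follows. The sketch is in Python, whose standard library supports only a limited subset of XPath, and it is not DAIM's PHP code. The element names `priceReply` and `price` are hypothetical.

```python
import xml.etree.ElementTree as ET


def extract_price(reply: str) -> float:
    """Parse the reply into a tree and read the price with a path expression."""
    root = ET.fromstring(reply)
    price = root.findtext("./price")  # path relative to the root element
    if price is None:
        raise ValueError("reply contains no <price> element")
    return float(price)


# Hypothetical reply message from the printer.
reply = "<priceReply><id>a1</id><price>349.50</price></priceReply>"
print(extract_price(reply))
```

Raising an error when the element is missing, instead of returning a default value, keeps a malformed reply from being displayed to the user as a price.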

E. System Reliability

Several reliability research fields can meet in the task of finding a good way to model and estimate the reliability of a web-based XML data exchange: the classical software reliability field, as described in for example [11] and [12]; tests and theory on XML document validity and other syntactic checks to test the exchange, as done in some recent works [1][2][3]; and network reliability, focusing for example on availability when some links of the system are down, recovery from packet loss, or correct ordering of packets [13]. Reliability within the network community also means availability. Software reliability is located within the broader software quality field together with, for instance, security, availability and usability, as discussed in [12] or in [14], where services consisting of several alternatives in particular are discussed. Previous work on reliability of receiving software does not develop a quantitative reliability measure. The branch of traditional software reliability theory that stems from system theory often measures reliability as MTTF (Mean Time To Failure) or MTBF (Mean Time Between Failures) [12]. This time-based measure is a good measurement for hardware components. For the task of describing a software system with user interaction it has some flaws. Although there are many models for software reliability, no standard has been established [12].

Figure 2. The data flow between the web application interface and the channel.

Reliability models also depend heavily on the intended use of the measure: is it to be used during the design phase? Between two releases of an application? During testing, to decide when a product is good enough? Or on finished products, to compare them? In the web service field, emphasis is put on designing reliable services, for instance through WS-RELIABILITY [15]. A reliability standard like WS-RELIABILITY, which guarantees message delivery, no duplicate delivery, etc., is tied to the SOAP protocol. Not only what reliability means in different contexts, but also what a failure or an error is, is under discussion, and several definitions coexist.

One popular approach to testing communication where XML messages are involved is to use different mapping techniques, i.e. transforming the XML document or the XML Schema to a formal model, often a tree structure model with constraints [1][2][3][16]. Constraints can be of the kind max occurrences and number of digits. For example, in DAIM's price request, all elements should have a single occurrence according to the Schema specification. These techniques have so far only been used for testing purposes, but can be extended to measure how well receiving software handles incorrect incoming messages.

To sum up, examining the reliability of a system like DAIM involves several challenges and decisions. Among them are modeling heterogeneous components, deciding which part of the system is most critical, and deciding what the exact purpose of the model and the measure is.

III. RELIABILITY OF DAIM’S XML EXCHANGE

Fig. 2 can be used as a very large-scale model for control flow or process algebraic reasoning about the request part of the XML data communication. In [2], three strategies that responding software can adopt when encountering an unexpected message are proposed:

• Reject the message.

• Modify the message so that it conforms to the XML Schema.

• Process the correct parts of the message and leave the rest.

The errors can come from XML Schema errors or channel errors. Errors in the generated documents are discovered by the XML Schema check before the message is sent. For the time being, DAIM is only capable of the first strategy: rejecting messages that do not comply with the Schema. In addition, an error message is produced. We do not yet know how well or how correctly the rejection is done. Finding out would demand testing like that proposed in, for example, [1]. Developing DAIM to handle the second strategy would increase the availability, and also the reliability, provided that the message is correct from the beginning. Fulfilling the third strategy would also be a possible development for DAIM in cases where non-fundamental parts of the message are erroneous, such as some of the administrative data for ordering the printing of the thesis. However, this would decrease the security of the process.

Several tests of XML messages have been done by mutating or perturbing data and then observing the reaction of the responding software. The data can be perturbed by changing, for instance, the types. The kinds of syntactic errors that can be introduced in an XML document are limited, since the set of allowed constraints is limited. In [1], a framework for testing XML data communication, in addition to testing RPC, is proposed. Cardinality and data types are checked. The regular tree grammar (RTG) is used as the formal model for the test design. In [2], the same authors continue their research by manipulating XML Schemas in order to produce invalid messages and to test the receiving software's ability to handle them. In [3], just as in [1] and [2], a transformation to an RTG model is used. The RTG model is somewhat extended, and proper fault classes are defined to characterize the tests and the faults. Papers [1][2][3] design tests for web services, which are above all interesting in web service environments where the messages are produced by several different sources and the responding application has to deal with that. In order to estimate the reliability from the test results, a framework for doing so has to be designed. This would be one way of testing DAIM. However, these errors are unlikely to emerge by themselves in the system, since the software generating the XML messages is controlled. This kind of testing would also be different from traditional software testing, since there is no user interface for the interaction on this level. More research is needed to increase the power of mapping techniques in test case design.
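The idea of perturbing cardinality and data types can be illustrated with a small sketch. It is written in Python, works directly on a message and not on a Schema or an RTG model, and is not the framework of [1], [2] or [3]. The function names and the `accepts` callback, which stands for the receiving software's validation step, are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET
from typing import Callable


def perturb(document: str) -> list[str]:
    """Generate invalid variants of a valid message with single-occurrence elements."""
    root = ET.fromstring(document)
    mutants = []
    for index, child in enumerate(root):
        # Cardinality perturbation: one element occurs twice.
        duplicated = copy.deepcopy(root)
        duplicated.insert(index, copy.deepcopy(child))
        mutants.append(ET.tostring(duplicated, encoding="unicode"))
        # Data type perturbation: a numeric value becomes non-numeric text.
        if (child.text or "").strip().isdigit():
            retyped = copy.deepcopy(root)
            retyped[index].text = "not-a-number"
            mutants.append(ET.tostring(retyped, encoding="unicode"))
    return mutants


def rejection_rate(accepts: Callable[[str], bool], mutants: list[str]) -> float:
    """Fraction of invalid messages that the receiving validator rejects."""
    if not mutants:
        raise ValueError("no mutants to test")
    rejected = sum(1 for mutant in mutants if not accepts(mutant))
    return rejected / len(mutants)
```

A rejection rate below 1 would point to invalid messages that slip through the validation, which is the kind of observation a reliability estimate could be built on.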

In [16], a framework for using process algebra to model and to reason about web service systems is outlined. The idea with process algebra is that it represents a system in another way than a graphical model of the same system would do, and thus offers other tools to reason about the model. During the design phase, software can be validated with the help of process algebra. Design phase reliability reasoning for, for instance, a second release of DAIM is not too far away as a way to make use of process algebra. There are several process algebraic languages that highlight different aspects and offer different possibilities. Finite State Machines (FSM) are well known as a tool for modeling systems. Process algebra can be used as an alternative to FSM when messages and processes are to be modeled. However, as with the RTG or other tree models, process algebra is an abstraction that requires both good skill and familiarity with the language and the ability to make use of it for the desired purposes. Modeling DAIM's message exchanges and software processes with process algebra as described in [16] would not clarify the interactions much, since the complexity of the interactions is relatively low, with four request-response exchanges. There is a lot of work between a good process algebraic description of a system and a method to estimate the reliability of that system.

Another approach, proposed in [17], would be to model the responding software's behavior with a Markov model. This model builds on the framework developed in [18], where the total reliability is calculated from the reliabilities of individual code components. The model in [17] has the advantage that it takes into account how often the components are actually used. A completely failure-free system has the reliability 1, a failing system the reliability 0. The interval [0,1] is used in many reliability models. Models of this kind are feasible options if the code, or at least a model of the code, is available, i.e. quite late in the design process. Applying this method to DAIM would mean making a control flow model of the code involved in the process in for example Figure 2, making sure that each component's reliability does not depend on previous components in this model, and then calculating the reliability:

$$R_{tot} = \prod_{i} (r_i)^{p_i}$$

The total reliability is estimated from the failure probability of the individual components, weighted with the probability of use of each component, assuming that the components are independent. For the XML exchange, the reliability of the validation component depends on how well it handles the Schema validation, both for valid and for invalid incoming messages. Since validation is done with different Schemas for the different messages, it should not be modeled as a single component. It is possible to extend the Fig. 2 meta-model to also cover the responding application's components and the channel. This would be interesting to do, since it contains more heterogeneous material than the PHP and JavaScript components used in [16]. It requires access to both the requesting and the responding software. Compared to the syntactic tests discussed above, this method gives a quantitative measure of the reliability and includes a greater part of the system than just the validation mechanism.
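A usage-weighted combination of component reliabilities can be sketched as below. The sketch assumes the product form R_tot = prod_i r_i^(p_i), where r_i is the reliability of component i and p_i its probability of use. This form is an assumption made for illustration and is not necessarily the exact formula of [17] or [18]. The component names and numbers are invented.

```python
import math


def total_reliability(components: dict[str, tuple[float, float]]) -> float:
    """Combine (reliability, usage probability) pairs of independent components.

    Assumed form: the product of reliability ** usage over all components.
    """
    total = 1.0
    for name, (reliability, usage) in components.items():
        if not (0.0 <= reliability <= 1.0 and 0.0 <= usage <= 1.0):
            raise ValueError(f"values for {name} must lie in [0, 1]")
        total *= reliability ** usage
    return total


# Invented example: one validation component per Schema, as argued above.
example = {
    "transform_input": (0.99, 1.0),
    "validate_price_request": (0.98, 0.5),
    "send_over_curl": (0.995, 1.0),
}
print(total_reliability(example))
```

A component that is never used (usage 0) contributes a factor of 1 and so does not affect the estimate, while a rarely used component pulls the total down less than a frequently used one with the same reliability.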

Apart from modeling web applications with a Markov model, we have modeled web applications with Bayesian Belief Networks within our project [19]. For estimating the reliability of DAIM's printing task this would, however, probably be a less feasible approach, since the printing task has a straightforward flow with few options.

Other models based on traditional software theory that explore reactions to unanticipated input would also be suitable here. MTTF and MTBF are not appropriate, since the connection is used intermittently and rarely. It would also be possible to use the most basic reliability formula and apply it to, for instance, input test results:

$$R_{tot} = \frac{T_s}{T_{tot}}$$

where $T_s$ is the number of successful tests and $T_{tot}$ is the total number of tests. This formula could be used on the whole message sequence it takes to submit a thesis for printing as well as on single message exchanges. Input testing can try different values, including values that according to the specification should not be allowed, and estimate the reliability from the outcome. The advantages of input testing are that the reliability measure is based on how well the system works from the user's point of view, that the measure is not based on the number of errors in the code, which is abstract to the user, and that it is easier to perform than, for instance, Schema perturbations. The disadvantages are that the method does not test XML specifically and that some of the inputs will be checked by JavaScript.
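Applying the ratio both to single exchanges and to the whole sequence can be sketched as follows. The test outcomes and the exchange names are invented for illustration and are not measurements from DAIM.

```python
def exchange_reliability(runs: list[dict[str, bool]], exchange: str) -> float:
    """Successful tests divided by total tests for one message exchange."""
    if not runs:
        raise ValueError("no test runs")
    return sum(run[exchange] for run in runs) / len(runs)


def sequence_reliability(runs: list[dict[str, bool]]) -> float:
    """A run counts as successful only if every exchange in it succeeded."""
    if not runs:
        raise ValueError("no test runs")
    return sum(all(run.values()) for run in runs) / len(runs)


# Invented outcomes of four input tests over the four message exchanges.
runs = [
    {"price_request": True, "price_reply": True, "order": True, "preview": True},
    {"price_request": True, "price_reply": True, "order": False, "preview": True},
    {"price_request": True, "price_reply": True, "order": True, "preview": True},
    {"price_request": False, "price_reply": True, "order": True, "preview": True},
]
print(exchange_reliability(runs, "order"))  # 3 of 4 runs succeeded
print(sequence_reliability(runs))           # 2 of 4 runs succeeded throughout
```

The sequence value can never exceed the lowest single-exchange value, so reporting both shows which exchange limits the overall figure.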

Other reliability frameworks from the software family concentrate on the number of errors in the software [12]. This is not a measure of how well the software actually works when used. Nor does it consider which parts of the software are used more than others, as do models where usage is integrated, such as in for example [20].

IV. CONCLUSIONS

Several approaches to estimating the reliability of a distributed XML data communication application have been examined in this paper. It has been found that models from traditional software reliability theory can be used with varying results. Time-based models proved to be less successful, whereas FSM or control flow models proved useful for modeling the whole chain of events between the web application interface and the printer. It is also possible to look at parts of the data communication chain; this would make the errors more traceable, which can be interesting during, for example, testing. The use of process algebra for modeling data exchange was found to be a little-explored approach. Together with direct manipulations of XML messages, done to test XML Schema validations, it merits further exploration and development. It has also been discussed how a usable reliability measure can be designed. The problem with many of today's reliability measures is that they lack practical usability. XML may find further use in DAIM when the theses are made available in bigger libraries. For this application it can be interesting to attach meta-data to the theses. This will offer new possibilities to study the system, taking meta-data aspects into account.

REFERENCES

[1] J. Offutt and W. Xu, "Generating test cases for web services using data perturbation," Workshop on Testing, Analysis and Verification of Web Services, Boston, 2004.

[2] W. Xu, J. Offutt, and J. Luo, "Testing web services by XML perturbation," in Proceedings of the 16th IEEE International Symposium on Software Reliability.

[3] M. C. Emer, S. R. Vergilio, and M. Jino, "A testing approach for XML schemas," in Proceedings of the 29th Annual International Computer Software and Applications Conference (COMPSAC'05).

[4] J. Zhang, "An approach to facilitate reliability testing of web services components," in Proceedings of the 15th International Symposium on Software Reliability Engineering.

[5] Bai and Dong, "WSDL-based automatic test case generation for web services testing."

[6] http://curl.haxx.se/

[7] P. Hudson, PHP in a Nutshell, 1st ed. CA: O'Reilly, 2006.

[8] M. C. Daconta, L. J. Obrst, and K. T. Smith, The Semantic Web - A Guide to the Future of XML, Web Services, and Knowledge Management, Indiana: Wiley, 2003.

[9] E. R. Harold and S. Means, XML in a Nutshell, 3rd ed. CA: O'Reilly, 2004.

[10] http://daim.idi.ntnu.no/

[11] J. Musa, Software Reliability Engineering, NY: McGraw-Hill, 1999.

[12] G. O'Regan, Mathematical Approaches to Software Quality, London: Springer-Verlag, 2006.

[13] K. P. Birman, Reliable Distributed Systems, NY, Heidelberg, Berlin: Springer, 2005.

[14] W. Abramowicz, M. Kaczmarek, and D. Zyskowski, "Duality in web services reliability," in Proceedings of the Advanced International Conference on Telecommunications and International Conference on Internet and Web Applications and Services (AICT/ICIW 2006).

[15] L. Gönczy et al., "Model based deployment of web services to standards-compliant reliable middleware," Proceedings of IADIS International Conference on WWW/Internet 2006, Murcia, Spain, vol. 1, 2006.

[16] G. Salaün, L. Bordeaux, and M. Schaerf, "Describing and reasoning on web services using process algebra," in Proceedings of the IEEE International Conference on Web Services (ICWS'04).

[17] T. Gyllensvard, "A web application model for software reliability," Proceedings of IADIS International Conference on WWW/Internet 2006, Murcia, Spain, vol. 2, 2006.

[18] R. C. Cheung, "A user-oriented software reliability model," IEEE Transactions on Software Engineering, vol. SE-6, no. 2, 1980, pp. 118-125.

[19] I. Canova Calori, T. Stålhane, and S. Ziemer, "Robustness analysis using FMEA and BBN - case study for a web based application," unpublished.

[20] C. Kallepalli and J. Tian, "Measuring and modeling usage and reliability for statistical web testing," IEEE Transactions on Software Engineering, vol. 27, no. 11, 2001, pp. 1023-1036.
