Selenium is an open-source tool for automating tests of web applications across different browsers and platforms. It supports multiple programming languages, including Java, Python, C#, Ruby, and JavaScript, and allows testers to automate browser actions such as clicking, typing, and navigating.
Selenium is not a single tool; it is a suite of tools for automating web application testing. It includes the following 4 main components:
WebDriver is the main interface in Selenium that is used to control the browser.
WebElement is used to interact with individual HTML elements (buttons, input fields, links, etc.).
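A minimal sketch of how the two interfaces fit together: WebDriver opens and inspects the page, and WebElement works on a single field. It assumes Selenium 4 on the classpath and Chrome installed locally; the element id "email" and the typed text are hypothetical and not taken from the real page.

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

class BasicsSketch {
    public static void main(String[] args) {
        // Assumption: Chrome is installed and a matching driver can be resolved.
        WebDriver driver = new ChromeDriver();
        try {
            driver.manage().window().maximize();
            driver.get("https://omrbranch.com");

            // Page-level information comes from WebDriver.
            System.out.println("Title: " + driver.getTitle());
            System.out.println("URL:   " + driver.getCurrentUrl());

            // Element-level interaction goes through WebElement.
            // "email" is a hypothetical id used only for illustration.
            WebElement email = driver.findElement(By.id("email"));
            if (email.isDisplayed() && email.isEnabled()) {
                email.clear();
                email.sendKeys("user@example.com");
            }
        } finally {
            // Always close the browser, even if a lookup fails.
            driver.quit();
        }
    }
}
```

If the hypothetical id does not exist on the page, findElement() throws NoSuchElementException, and the finally block still closes the browser.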
get(String url) – Open a URL in the browser
getTitle() – Get the current page title
getCurrentUrl() – Get the current page URL
navigate().to(String url) – Navigate to a URL
navigate().back() – Go back to the previous page
navigate().forward() – Go forward to the next page
navigate().refresh() – Refresh the current page
manage().window().maximize() – Maximize the browser window

click() – Click the element
sendKeys(String text) – Enter text into an input field
clear() – Clear the input field
getText() – Get the visible text of the element
getAttribute(String name) / getDomProperty(String name) – Get an attribute or DOM property value of the element
isDisplayed() – Check if the element is visible
isEnabled() – Check if the element is enabled
isSelected() – Check if a checkbox or radio button is selected

findElement() – Returns a single WebElement; throws NoSuchElementException if no match is found
findElements() – Returns a list (List<WebElement>) and finds all matching elements on the page

get() and navigate().to() both open a URL:
driver.get("https://omrbranch.com");
driver.navigate().to("https://omrbranch.com");

Locators are used to identify elements on a web page (like buttons, input fields, links, etc.). Selenium provides 8 types of locators:
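As an illustration, the By class in Selenium's Java binding has one static factory method per locator type. The sketch below builds one locator of each kind; every value passed in (ids, class names, link text, selectors) is hypothetical.

```java
import java.util.List;
import org.openqa.selenium.By;

class LocatorSketch {
    // One By instance per locator type; all argument values are made up.
    static List<By> allLocatorTypes() {
        return List.of(
            By.id("email"),
            By.name("email"),
            By.className("form-control"),
            By.tagName("input"),
            By.linkText("Forgot password?"),
            By.partialLinkText("Forgot"),
            By.cssSelector("form#login input[type='text']"),
            By.xpath("//input[@id='email']"));
    }

    public static void main(String[] args) {
        // Building locators needs no browser; they are only descriptions
        // that findElement() or findElements() evaluates later.
        allLocatorTypes().forEach(System.out::println);
    }
}
```

Any of these objects can be passed to driver.findElement() or driver.findElements().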
XPath (XML Path Language) is used in Selenium to locate elements in an HTML document using path expressions.
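To make path expressions concrete without starting a browser, here is a small sketch that evaluates a few XPath expressions with the JDK's built-in javax.xml.xpath engine. This is not Selenium's own lookup, and the markup is a hypothetical, well-formed page invented for the example; the same expressions could be passed to By.xpath() in Selenium.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSketch {
    // Hypothetical page, written as well-formed XML so the JDK parser accepts it.
    static final String PAGE =
        "<html><body>"
      + "<form id='login'>"
      + "<input id='email' type='text'/>"
      + "<input id='pass' type='password'/>"
      + "</form>"
      + "<div><input id='search' type='text'/></div>"
      + "</body></html>";

    // Returns how many nodes in PAGE match the given XPath expression.
    static int countMatches(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression,
                new InputSource(new StringReader(PAGE)),
                XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Full path from the root: only the two inputs inside the form.
        System.out.println(countMatches("/html/body/form/input"));      // 2
        // Any input anywhere in the document.
        System.out.println(countMatches("//input"));                    // 3
        // Filter by attribute value.
        System.out.println(countMatches("//input[@type='text']"));      // 2
        // Combine a parent condition with a child condition.
        System.out.println(countMatches("//form[@id='login']/input[@id='pass']")); // 1
    }
}
```

Real pages are parsed as HTML rather than XML, so results in a browser can differ for markup that is not well-formed.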
An absolute XPath starts from the root element (html) and follows the entire path, beginning with a single slash (/). A relative XPath begins with a double slash (//) and can start from any element in the document.
In Selenium, to perform mouse actions like hover or drag-and-drop, I use the Actions class. First, I create an Actions object:
Actions actions = new Actions(driver);
Then I use methods such as moveToElement() for hover, click(), doubleClick(), or contextClick(), depending on the action needed. Finally, I always call perform() to execute the complete action. This is especially useful when working with hover-based menus, sliders, or other complex mouse interactions.
For mouse actions like double-click and right-click, I again use the Actions class. To double-click on an element:
actions.doubleClick(element).perform();
For right-click (context click):
actions.contextClick(element).perform();
These interactions are very helpful when dealing with custom menus or options on a webpage.
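The mouse actions described here can be sketched as small helper methods. This is an illustration under assumptions, not a definitive implementation: the class name, method names, and the By locators passed in are hypothetical, and a real hover menu may need an explicit wait before the sub-item becomes clickable.

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.interactions.Actions;

class MouseActionsSketch {
    // Hover over a menu, then click an item that appears on hover.
    static void hoverAndClick(WebDriver driver, By menu, By subItem) {
        new Actions(driver).moveToElement(driver.findElement(menu)).perform();
        // Look up the sub-item only after the hover has revealed it.
        WebElement item = driver.findElement(subItem);
        new Actions(driver).moveToElement(item).click().perform();
    }

    // Double-click the element found by the given locator.
    static void doubleClick(WebDriver driver, By target) {
        new Actions(driver).doubleClick(driver.findElement(target)).perform();
    }

    // Right-click (context click) the element found by the given locator.
    static void rightClick(WebDriver driver, By target) {
        new Actions(driver).contextClick(driver.findElement(target)).perform();
    }

    // Drag one element onto another.
    static void dragAndDrop(WebDriver driver, By source, By target) {
        new Actions(driver)
            .dragAndDrop(driver.findElement(source), driver.findElement(target))
            .perform();
    }
}
```

Nothing is sent to the browser until perform() is called, which is why every chain ends with it.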
When the webpage has frames or iframes, we cannot interact with elements inside them until we switch the driver's focus to the frame. To do that, I use:
driver.switchTo().frame();
The frame() method takes the frame's index, its name or id, or a WebElement reference. Once the task is complete, I switch back to the main page:
driver.switchTo().defaultContent();
This is important in applications where forms or content are embedded inside frames.
To find all frames in a webpage:
List<WebElement> frames = driver.findElements(By.tagName("iframe"));
int totalFrames = frames.size();
This helps when debugging or dynamically switching through multiple frames.
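A minimal sketch of the frame workflow, assuming Selenium 4: count the iframes, switch into one, act on an element, and always switch back. The class name, method names, and locators are hypothetical placeholders.

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;

class FrameSketch {
    // Counts the <iframe> elements in the current browsing context.
    static int countIframes(WebDriver driver) {
        return driver.findElements(By.tagName("iframe")).size();
    }

    // Switches into a frame, types into a field, and returns to the main page.
    static void typeInsideFrame(WebDriver driver, By frameLocator,
                                By fieldLocator, String text) {
        WebElement frame = driver.findElement(frameLocator);
        // Other overloads: frame(int index) and frame(String nameOrId).
        driver.switchTo().frame(frame);
        try {
            driver.findElement(fieldLocator).sendKeys(text);
        } finally {
            // Restore focus even if the field lookup fails.
            driver.switchTo().defaultContent();
        }
    }
}
```

Switching back in a finally block keeps later steps from accidentally searching inside the frame.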
An Alert is a JavaScript-based browser pop-up that blocks interaction with the page until it is handled. In Selenium, I handle it using:
driver.switchTo().alert();
To handle a simple alert, I switch to it with this call, use getText() if I need to read the message, and then use accept() to click OK and close the alert. A simple alert has no Cancel button or input field.
To handle a Confirm Alert in Selenium, I switch to the alert using driver.switchTo().alert(). This alert has both OK and Cancel buttons. I use accept() if I want to click OK, and dismiss() to click Cancel. It’s usually used when the application is asking for user confirmation before performing an action.
A Prompt Alert not only has OK and Cancel, but also allows text input. To handle it, I first switch using driver.switchTo().alert(). Then I use sendKeys("inputText") to enter data into the alert. Finally, I use either accept() to submit or dismiss() to cancel. This is helpful for alerts that ask for user details or search input.
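The three alert types can be handled with the same Alert interface. The sketch below assumes Selenium 4; the explicit wait for the alert and its 5-second timeout are additions of mine and not part of the answers above, and the class and method names are hypothetical.

```java
import java.time.Duration;
import org.openqa.selenium.Alert;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

class AlertSketch {
    // Waits up to 5 seconds (an arbitrary choice) for an alert and switches to it.
    static Alert waitForAlert(WebDriver driver) {
        return new WebDriverWait(driver, Duration.ofSeconds(5))
            .until(ExpectedConditions.alertIsPresent());
    }

    // Simple alert: read the message, then click OK.
    static String acceptSimpleAlert(WebDriver driver) {
        Alert alert = waitForAlert(driver);
        String message = alert.getText();
        alert.accept();
        return message;
    }

    // Confirm alert: click OK or Cancel depending on the flag.
    static void answerConfirm(WebDriver driver, boolean clickOk) {
        Alert alert = waitForAlert(driver);
        if (clickOk) {
            alert.accept();
        } else {
            alert.dismiss();
        }
    }

    // Prompt alert: type the input, then click OK.
    static void answerPrompt(WebDriver driver, String input) {
        Alert alert = waitForAlert(driver);
        alert.sendKeys(input);
        alert.accept();
    }
}
```

Without a wait, driver.switchTo().alert() throws NoAlertPresentException if the alert has not appeared yet.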
If I'm inside a nested (child) frame and want to go up one level, I use driver.switchTo().parentFrame(). This takes me to the immediate parent frame of the current one.
To completely exit all frames and return to the main HTML page, I use driver.switchTo().defaultContent(). This command is essential when switching between frames and webpage content.
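The difference between the two calls is easiest to see with nested frames. This is a sketch only: the class name and the frame names passed as parameters are hypothetical.

```java
import org.openqa.selenium.WebDriver;

class NestedFrameSketch {
    // Walks main page -> outer frame -> inner frame, then back out again.
    static void visitNestedFrames(WebDriver driver, String outerFrame, String innerFrame) {
        driver.switchTo().frame(outerFrame);   // focus: outer frame
        driver.switchTo().frame(innerFrame);   // focus: inner frame (child of outer)

        // ... interact with elements inside the inner frame here ...

        driver.switchTo().parentFrame();       // focus: outer frame (one level up)

        // ... interact with elements inside the outer frame here ...

        driver.switchTo().defaultContent();    // focus: main page (all frames exited)
    }
}
```

Each frame(String) call is resolved relative to the current context, so the inner frame has to be selected after the outer one.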
In cases where sendKeys() doesn't work properly, such as hidden elements or dynamic inputs, I use JavascriptExecutor. First, I cast the WebDriver instance to JavascriptExecutor, and then run:
javascriptExecutor.executeScript("arguments[0].value='text';", element);
This sets the value directly in the DOM. It is a useful workaround when standard typing fails, although it bypasses the keyboard events that sendKeys() would trigger.
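A hedged sketch of this workaround as a helper method. Passing the text as a second script argument avoids quoting problems, and dispatching input and change events is an addition of mine for pages that listen to those events; the class and method names are hypothetical.

```java
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;

class JsInputSketch {
    // Sets an input's value through JavaScript instead of simulated typing.
    static void setValue(WebDriver driver, WebElement element, String text) {
        // Browser drivers such as ChromeDriver implement JavascriptExecutor.
        JavascriptExecutor js = (JavascriptExecutor) driver;
        js.executeScript(
            "arguments[0].value = arguments[1];"
            // Assumption: the page reacts to these events; drop them if not needed.
          + "arguments[0].dispatchEvent(new Event('input', { bubbles: true }));"
          + "arguments[0].dispatchEvent(new Event('change', { bubbles: true }));",
            element, text);
    }
}
```

Because this skips real user interaction, it is best kept as a fallback rather than the default way to enter text.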